Bioinformatical design of a vaccine against influenza virus N1 subtype

Bonaccorsi, Irene; Clausen, Martin Bau; Høj, Leif Howalt; Kjær, Jesper and Sayyad, Fhayaz Ahammad

Introduction

Morbidity and mortality of influenza

Every year 5-20% of the US population is infected with influenza, 200,000 people are hospitalised and 36,000 die [1]. This makes influenza a serious pathogen that causes massive losses in productivity in the industrialised world and a large number of deaths.

Influenza virus

Human influenza viruses are members of the orthomyxovirus family, which consists of influenza A, B and C viruses and Thogotovirus (in ticks). In humans, only influenza A and B viruses are of epidemiological interest. The main antigenic determinants of influenza A and B viruses are the transmembrane glycoproteins hemagglutinin (HA or H) and neuraminidase (NA or N). Based on the antigenicity of these glycoproteins, influenza A viruses are further subdivided into sixteen H (H1-H16) and nine N (N1-N9) subtypes.

Neuraminidase

Like HA, neuraminidase is a glycoprotein found as projections on the surface of the virus, where it forms a tetrameric structure. The NA molecule presents its main part at the outer surface of the cell, spans the lipid layer and has a small cytoplasmic tail. NA acts as an enzyme, cleaving sialic acid from the HA molecule, from other NA molecules and from glycoproteins and glycolipids at the cell surface. It also serves as an important antigenic site and, in addition, seems to be necessary for the penetration of the virus through the mucin layer of the respiratory epithelium. Antigenic drift can also occur in NA. NA carries several important amino acid residues which, if mutated, can lead to resistance against neuraminidase inhibitors.
Targeting a distinct protein as vaccine candidate

Choosing the protein

Sequences of human influenza A (covering the period 2000-2006) were taken from the NCBI Influenza Virus Resource database [2]. A multiple alignment with ClustalW [3] of the NA subtype N1 revealed it to be the most conserved gene (see the NJ tree in figure 2). A consensus sequence based on 197 protein sequences was then chosen as our vaccine candidate. Neighbour-joining phylogenetic trees were built with MEGA 3.1 [4] in order to ascertain the degree of conservation of NA in different influenza subtypes.
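How the consensus sequence was derived from the alignment is not detailed above. A minimal sketch in Python, assuming a simple majority rule per alignment column (an assumption, not necessarily the rule used for this poster), could look as follows. The function name `consensus` and the toy sequences are hypothetical and serve only as illustration.

```python
from collections import Counter


def consensus(aligned, gap="-"):
    """Majority-rule consensus of equal-length aligned sequences.

    Assumption: the most frequent symbol in a column wins; columns in
    which the gap symbol is the most frequent are dropped.
    """
    if not aligned:
        raise ValueError("no sequences given")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    residues = []
    for column in zip(*aligned):
        # most_common(1) returns the symbol with the highest count;
        # ties go to the symbol encountered first in the column.
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != gap:
            residues.append(symbol)
    return "".join(residues)


# Toy alignment (made-up sequences, not taken from the dataset).
toy_alignment = ["MNPNQ", "MNPNK", "MNPNQ"]
print(consensus(toy_alignment))
```

Applied to the aligned N1 sequences, the same call would return a single candidate sequence; a stricter rule (for example a minimum column frequency) would be an equally plausible choice.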
Figure 2. Neighbour-joining tree of 222 NA sequences (N1 subtype; scale bar: 0.02). The 25 H5N1 sequences forming the distinct lower clade were removed from the dataset before we generated the consensus sequence.

Analysing the protein

Class I MHC epitopes for all supertypes were predicted with NetCTL [5]. Epitopes for HLA-DR4 (an MHC class II allele) were predicted with EasyGibbs [6]. A B-cell epitope was identified with BepiPred [7], which uses an HMM and reports the residue scores as log-odds.

Table 2. The three highest scoring epitopes for HLA-DR4 (MHC class II). Predictions were made with EasyGibbs [6]. All identified epitopes are in highly conserved regions of NA.

Epitope location   Epitope      EasyGibbs score
155                YRALMSCPL    19.381
239                FTIMTDGPS    17.788
100                YTKDNSIRI    16.758

Table 1. Best scoring epitope predictions in NA subtype N1 obtained with NetCTL. Predictions are made for MHC-I binding (affinity), proteasomal C-terminal cleavage and TAP translocation. The three measures are combined into a single score (combined score); the listed epitopes had the highest combined score for their respective HLA supertypes. The epitope location is the position in the NA protein sequence of the first amino acid of the epitope.

Epitope location   HLA supertype   Epitope
304                A1              VSFNQNLDY
22                 A2              LMLQIGNII
254                A3              KIFKIEKGK
132                A24             FFLTQGALL
150                B7              KDRSPYRAL
100                B8              YTKDNSIRI
151                B27             DRSPYRALM
276                B44             YEECSCYPD
182                B58             SACHDGMGW
274                B62             HQNEQGSGY

B-Cell epitopes

With BepiPred we predicted the region 322 to 348 to be the best region for a linear B-cell epitope.
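A predicted region such as 322 to 348 can be narrowed down to its best scoring 9-mer with a sliding-window scan over per-residue scores. The Python sketch below assumes that the score of a window is the plain sum of its residue scores, which is an assumption about how the 9-mer score was obtained. The name `best_window` and the toy scores are illustrative; the scores are not BepiPred output.

```python
def best_window(scores, width=9):
    """Return (start_index, window_sum) of the highest scoring window.

    Assumption: a window is scored by summing its per-residue scores.
    The start index is 0-based; on ties the first window is kept.
    """
    if width <= 0 or len(scores) < width:
        raise ValueError("need at least `width` scores")

    window_sum = sum(scores[:width])
    best_start, best_sum = 0, window_sum
    for start in range(1, len(scores) - width + 1):
        # Slide by one residue: add the entering score, drop the leaving one.
        window_sum += scores[start + width - 1] - scores[start - 1]
        if window_sum > best_sum:
            best_start, best_sum = start, window_sum
    return best_start, best_sum


# Toy per-residue scores (made up for illustration only).
toy_scores = [0, 1, 2, 3, 2, 1, 0]
start, total = best_window(toy_scores, width=3)
print(start, total)
```

Adding the returned 0-based index to the first position of the scanned region (322 here) gives the epitope location in NA numbering.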
The predicted B-cell epitope region reads:

FGDNPRPKDGEGSCNPVTVDGANGVKG

The best scoring 9mer epitope within this region has a BepiPred score for all 9 residues of 19.788. Figures 3 and 4 show 3D visualisations and figure 5 a logo of the epitope.

Figure 4. Zoom of the predicted epitope area (red); the highest scoring 9mer is coloured yellow. Figures 3 and 4 were generated in PyMOL [8] with 1V0Z.pdb.

Results

T-Cell epitopes

Table 1 lists the NetCTL predictions of good MHC-I epitope candidates in NA. Table 2 shows the MHC-II epitopes. All of these lie in highly conserved regions of the NA sequences covering 2000 to 2006.

Table 1 (continued). NetCTL predictions for the epitopes of table 1, listed by HLA supertype in the same row order.

HLA supertype   Affinity   C-terminal cleavage   TAP binding   Combined score
A1              0.5107     0.9953                 3.0830       3.7221
A2              0.4439     0.9847                 0.7320       1.1974
A3              0.5628     0.6765                 0.7600       1.7818
A24             0.7806     0.9461                 1.0550       1.2154
B7              0.3923     0.9796                 0.8500       1.2284
B8              0.2214     0.9986                 0.5000       1.1964
B27             0.1291     0.9990                 0.2920       0.7520
B44             0.1431     0.0124                -2.1380       0.8838
B58             0.6493     0.4893                 1.1530       2.3525
B62             0.3828     0.9083                 3.1650       1.2279

Creating a plasmid vaccine

Sequences for all influenza proteins were taken from the NCBI Influenza Virus Resource database [2] (only genes sequenced in the period 2000-2006 were used). Multiple alignments were made with ClustalW [3], giving consensus sequences for all proteins. Epitopes for all MHC-I supertypes and for HLA-DR4 were found with NetCTL [5] and EasyGibbs [6], respectively. A polytope of all epitopes was constructed with triple-A linker regions joining the epitopes. The polytope was optimised with a Monte Carlo Metropolis simulation (MCM) implemented in polytope_cont3 (unpublished). MCM settings: all standard, but with 700 iterations and 14 temperature steps. The final polytope was evaluated with NetCTL to check C-terminal cleavage, TAP translocation and affinity.

Results

For the final polytope we replaced two epitopes (HLA: A3 and B44) with suboptimal epitopes (both ranked 2nd in the NetCTL prediction) because these had better C-terminal cleavage than the best predicted ones.

Figure 1. Structure of an influenza A virus.
Proteins shown: Polymerase B1, B2 and A proteins (PB1 + PB2 + PA), hemagglutinin (HA), nucleocapsid protein (NP), neuraminidase (NA), matrix proteins (M1 + M2) and non-structural protein (NS).

All epitopes were validated to be in highly conserved regions of the genes. polytope_cont3 predicted only three C-terminal cleavages within epitopes (HLA: B7, A3 and B58), and no additional epitopes would arise from the polytope. Given the stochastic nature of the proteasome, we do not consider these few internal cleavage sites to be a problem.

Table 3. The polytope, read from the top row to the bottom row of the epitope column. For each epitope the table also gives the gene and position within it (epitope location), the HLA supertype and the validated rank of the epitope in NetCTL. DR4 is an MHC-II allele and is therefore not validated against NetCTL.

Epitope location   Epitopes (upper case) and linker regions (lower case)   HLA supertype   Rank in NetCTL prediction
PB1 (540-548)      maaeyGPATAQMAL                                          B7              1
PB1 (482-490)      SYINRTGTFaavdd                                          A24             1
PA (495-503)       RRKTNLYGFiryq                                           B8              1
PB2 (523-531)      GTEKLTITYkrw                                            A1              1
HA (224-232)       RRFTPEIAKa                                              B27             1
PA (46-54)         FMYSDFHFInlwcek                                         A2              1
PA (126-134)       EVHIYYLEKc                                              A3              1
PB2 (529-537)      ITYSSSMMWavdy                                           B58             2
PA (140-148)       SEKTHIHIF                                               B44             1
HA (377-385)       YAADQKSTQwhyrw                                          DR4             NA
HA (369-377)       HQNEQGSGYaa                                             B62             1

Figure 3. Tetramer structure of neuraminidase viewed from above (white lines show the 4 proteins). Residues not coloured yellow are part of highly conserved sequence regions.

Figure 5. Logo [9] of the 9mer B-cell epitope from figure 4 (across all 197 sequences). All positions are highly conserved, although positions 5 and 8 indicate the presence of mutation and a risk of immune escape.

Are these epitopes novel?

We searched for all identified epitopes in the SYFPEITHI database [10] and found no matches; in general, however, SYFPEITHI seems to be lacking epitopes for influenza. Most vaccines against influenza are based on heat-killed influenza strains and not on epitopes.

Discussion

We limited our work to influenza A virus and subtype N1 because we believe vaccines for influenza will be most efficient when targeted against a single subtype rather than multiple subtypes. The purpose of this approach is to keep the epitopes specific rather than general in order to achieve a focused immune response. The approach can easily be repeated for other subtypes.

We are aware that immune escape does happen. Any vaccine against influenza will have to be updated often and may also have to differ between regions of the world [11]. The question remains whether the approach presented here will be easier than the current vaccines based on heat-killed strains.

The vaccines we have designed are based on a dataset for the period 2000 to 2006. As can be seen in figure 2, the N1 subtype sequences show a very high degree of similarity. This indicates that epitope-based vaccines may be useful for longer periods than the current methods, which are updated every year [11].

The bioinformatically designed vaccines are in no way final; in vitro and in vivo tests are necessary to determine the real effect of the vaccines. The benefit of the bioinformatical approach is that the preliminary design is very cost efficient compared to the standard laboratory trial-and-error approach to vaccine design.

Conclusion

We have used bioinformatical tools to identify MHC-I and MHC-II supertype T-cell epitope candidates and a B-cell epitope candidate in the highly conserved NA protein. In addition, we constructed a polytope of highly conserved epitopes found in the full genome of the human influenza A virus subtype N1. The methods used can easily be applied to other subtypes. Virus subtypes should be dealt with individually in order to make the vaccines specific and effective.

References

Figures:

Figure 1. Image copyright by Dr. Markus Eickmann, Institute for Virology, Marburg, Germany. http://www.biografix.de

Literature, web resources and tools:

[1] CDC - Influenza (Flu) | What Everyone Should Know About Flu and the Flu Vaccine (http://www.cdc.gov/flu/keyfacts.htm)
[2] NCBI Influenza Virus Resource (http://www.ncbi.nlm.nih.gov/genomes/FLU/)
[3] ClustalW: Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Research, submitted, June 1994.
[4] MEGA 3, version 3.1: Kumar S., Tamura K. and Nei M. (2004). MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Briefings in Bioinformatics 5:150-163.
[5] Larsen M.V., Lundegaard C., Lamberth K., Buus S., Brunak S., Lund O. and Nielsen M. (2005). An integrative approach to CTL epitope prediction. A combined algorithm integrating MHC-I binding, TAP transport efficiency, and proteasomal cleavage predictions. European Journal of Immunology 35(8):2295-303. (http://www.cbs.dtu.dk/services/NetCTL/)
[6] Nielsen M., Lundegaard C., Worning P., Hvid C.S., Lamberth K., Buus S., Brunak S. and Lund O. (2004). Improved prediction of MHC class I and class II epitopes using a novel Gibbs sampling approach. Bioinformatics 20:1388-97. (http://www.cbs.dtu.dk/biotools/EasyGibbs/)
[7] Larsen J.E.P., Lund O. and Nielsen M. (2006). Improved method for predicting linear B-cell epitopes. Immunome Research 2:2. (http://www.cbs.dtu.dk/services/BepiPred/)
[8] PyMOL version 0.99 (http://pymol.sourceforge.net/)
[9] Crooks G.E., Hon G., Chandonia J.M. and Brenner S.E. (2004). WebLogo: A sequence logo generator. Genome Research 14:1188-1190. (http://weblogo.berkeley.edu/)
[10] Rammensee, Friede, Stevanovic (1995). MHC ligands and peptide motifs: 1st listing. Immunogenetics 41:178-228. (http://www.syfpeithi.de/)
[11] The Influenza Sequence Database (http://www.flu.lanl.gov/)