HEP_23101_sm_SupplMaterial

SUPPLEMENTARY MATERIAL
Consensus differences and genetic distances between genotypes 1a and 3a
Overall, the consensus amino acid sequence differed between HCV 1a and 3a genotypes at 515
of 2,147 residues examined (24%; Supplementary Table 1). Of the consensus amino acids that
differ between the genotypes, 350 (68%) involved a 1-nucleotide substitution, with a consistent
distribution across the non-structural proteins (Supplementary Table 1). For these sites where the
genetic barrier (number of substitution steps) between HCV genotypes is low, we found that the
amino acid variation of one genotype included the alternative consensus at 166 residues (47% of
1-nucleotide distant sites). For the remaining sites (n=184, 53%), the amino acid variation of one
genotype did not contain the consensus sequence of the alternative genotype despite a low
genetic barrier. Of these sites, amino acid substitutions were physicochemically conservative in
62% of cases, indicating that factors other than genetic distance serve to preserve distinct
inter-genotype differences at many of these sites. For the 165 (32%) residues where there was a higher
genetic barrier between consensus amino acids, there were only two sites at which the alternative
consensus amino acid was also represented as an intra-genotype variant. Otherwise, these
genetically divergent sites defined highly genotype-specific mutational pathways.
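
The genetic barrier described above can be illustrated with a short sketch. It assumes the standard genetic code and takes the barrier between two amino acids to be the smallest number of nucleotide differences between any pair of their codons. The source does not state whether the observed consensus codons were compared directly, so the function name `genetic_barrier` and the minimum-over-codons rule are assumptions made for illustration only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def genetic_barrier(aa1, aa2):
    """Minimum number of nucleotide substitutions separating aa1 from aa2.

    Assumption: the barrier is the smallest Hamming distance over all codon
    pairs, which may differ from a comparison of the actual consensus codons.
    """
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in codons_for(aa1)
        for c2 in codons_for(aa2)
    )


if __name__ == "__main__":
    # Isoleucine to valine needs one change (e.g. ATT -> GTT): a low barrier.
    print("I -> V:", genetic_barrier("I", "V"))
    # Methionine (ATG) to tryptophan (TGG) needs two changes: a higher barrier.
    print("M -> W:", genetic_barrier("M", "W"))
```

Under this definition, a barrier of 1 corresponds to the "1-nucleotide distant" sites in the text and a barrier of 2 or 3 to the sites with a higher genetic barrier.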
Consensus differences and proteasomal cleavage predictions
Given that the processing of viral antigens relies on relatively monomorphic proteins to direct
appropriate peptide cleavage, and that abrogation of this antigen processing pathway provides a
means of evading CTL responses through ‘processing escape’ (1;2), we next examined the
effects of genotype-specific differences in consensus sequences on antigen processing using the
NetChop prediction tool (3;4). This analysis identified 102 sites (20% of all discordant consensus
residues) where consensus sequence variation was predicted to abrogate proteasomal cleavage
(55 for genotype 1a; 47 for genotype 3a), as shown in Supplementary Table 1. The large number
of different consensus amino acids and predicted processing sites between the genotypes is likely
to result in differences in the repertoire of HLA-restricted viral epitopes across the genomes.
Analyses of positive and negative codon selection and amino acid co-variation
Analysis of positive and negative selection for the HCV non-structural protein sequences utilised
the Single-Likelihood Ancestor Counting algorithm implemented in the program HyPhy,
consistent with the approach adopted by Campo et al. (5), in which selection of HCV genotype
1b sequences was considered. The assessment of amino acid co-variation at sites of HLA-associated
polymorphism utilised Fisher’s exact tests for classification as consensus vs non-consensus
amino acid using S-Plus 8.0 (Insightful Corporation, Seattle, USA). Here, sites at
which amino acid polymorphism was significantly associated with the presence of each
HLA-associated polymorphism described in Table 2 (cut-off p<0.01) were identified.
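
As a rough illustration of the co-variation test described above, the sketch below computes a two-sided Fisher's exact p-value for a 2x2 table of consensus vs non-consensus calls at two sites and applies the p<0.01 cut-off. The original analysis used S-Plus 8.0; this pure-Python version, the function name `fisher_exact_two_sided`, and the counts in the usage example are hypothetical and are not taken from the study data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: site 1 consensus / non-consensus.
    Columns: site 2 consensus / non-consensus.
    The p-value sums the hypergeometric probabilities of all tables with the
    same margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Hypothetical counts: sequences cross-classified at two sites.
    p = fisher_exact_two_sided(40, 5, 6, 20)
    print(f"p = {p:.3g}; co-varying at cut-off 0.01: {p < 0.01}")
```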
Evidence of positive and negative selection in HCV genotypes 1a and 3a
We also examined these HCV sequences for evidence of positive and negative selection at each
residue position, using the Single-Likelihood Ancestor Counting algorithm implemented in the
program HyPhy (6). In keeping with previous studies (5), we identified a dominant overall
pattern of negative selection, implying a relative abundance of synonymous codon substitutions
and suggesting that extensive amino acid variation is not tolerated. Overall, only 30 of the 2,147
codon sites analyzed (1.4%) showed evidence of positive selection in genotype 1a (listed in
Supplementary Table 2) compared with 1,415 sites under negative selection (66%) across the
non-structural HCV proteins. Similar results were obtained for genotype 3a, where evidence of
positive selection was identified at only 13 sites (0.6%) while negative selection was evident at
1,395 sites (65%) (Supplementary Table 2). Interestingly, only five positively selected sites were
common to both genotypes 1a and 3a.
Codon selection and co-variation at HLA-associated sites
There were a number of HLA-associated polymorphic sites without evidence of co-variation,
suggesting that the selection of amino acid variation is primarily determined by site-specific
characteristics. These putative ‘independent’ sites accounted for the majority of HCV genotype
3a associations (13/18 sites = 72%) but were less common for genotype 1a (12/32 sites = 37.5%).
For genotype 1a associations of this type, it is notable that a number of sites are characterized by
negative codon selection (eg. NS3-1398 and 1403 (HLA-B*0801); NS3-1495 and 1503
(HLA-A*0101); NS2-841 (HLA-C*0401) and 851 (HLA-C*1502)), implying a locally-determined
fitness cost of mutation that is in keeping with experimental data (7). This was not as apparent
for ‘independent’ HLA-associated polymorphisms within genotype 3a, which tended (with the
exception of NS3-1133) to demonstrate neutral selection. As shown in Supplementary Table 3,
there were also a number of HLA-associated polymorphisms that did appear to exist within
highly integrated networks, involving amino acid co-variation throughout the non-structural
HCV proteins. These ‘integrated’ sites included several examples in which amino acid
co-variation linked to an HLA-associated polymorphism was also identified as a ‘primary’
HLA-association site. This could in some cases be attributed to a common HLA-restriction (eg. linked
amino acid variation at NS2 residues 958 and 1006, as well as NS4b residue 1723, each
associated with HLA-B*3701; and co-varying NS5b polymorphism at positions 2841 and 2846
associated with HLA-B*2705), where it is conceivable that the selection of a primary CTL
escape mutation (eg. at residue 958, as shown in Figure 2) could entrain compensatory mutations
elsewhere that preserve viral fitness. However, we also noted several links between ‘integrated’
sites associated with distinct HLA alleles – for example, NS2-957 polymorphism (HLA-B*1302)
is associated with co-variation at NS4a residue 1695, which is in turn an HLA-B*2705-associated
site. Each of these examples, involving seven polymorphic sites in genotype 1a, is highlighted in
bold in Supplementary Table 3.
Reference List
1. Kimura Y, Gushima T, Rawale S, Kaumaya P, Walker CM. Escape mutations alter
proteasome processing of major histocompatibility complex class I-restricted epitopes in
persistent hepatitis C virus infection. J Virol 2005;79(8):4870-4876.
2. Seifert U, Liermann H, Racanelli V, Halenius A, Wiese M, Wedemeyer H et al. Hepatitis C
virus mutation affects proteasomal epitope processing. J Clin Invest 2004;114(2):250-259.
3. Nielsen M, Lundegaard C, Lund O, Kesmir C. The role of the proteasome in generating
cytotoxic T-cell epitopes: insights obtained from improved predictions of proteasomal
cleavage. Immunogenetics 2005;57(1-2):33-41.
4. Saxova P, Buus S, Brunak S, Kesmir C. Predicting proteasomal cleavage sites: a comparison
of available methods. Int Immunol 2003;15(7):781-787.
5. Campo DS, Dimitrova Z, Mitchell RJ, Lara J, Khudyakov Y. Coordinated evolution of the
hepatitis C virus. Proc Natl Acad Sci U S A 2008;105(28):9685-9690.
6. Pond SL, Frost SD, Muse SV. HyPhy: hypothesis testing using phylogenies. Bioinformatics
2005;21(5):676-679.
7. Salloum S, Oniangue-Ndza C, Neumann-Haefelin C, Hudson L, Giugliano S, aus dem Siepen M et
al. Escape from HLA-B*08-restricted CD8 T cells by hepatitis C virus is associated with
fitness costs. J Virol 2008;82(23):11803-11812.
8. Kumar S, Tamura K, Nei M. MEGA3: Integrated software for Molecular Evolutionary
Genetics Analysis and sequence alignment. Brief Bioinform 2004;5(2):150-163.
Supplementary Figure legends
Supplementary Figure 1. HLA Class I allele distribution across genotypes (A) and cohorts
(B).
Supplementary Figure 2. Polymorphism profile and correlation of polymorphism rates in
non-structural HCV proteins. Polymorphism profile in the non-structural HCV proteins (left
panels). Vertical bars indicate the proportion of sequences with non-consensus residues for
genotype 1a above the line and genotype 3a below the line. A red circle along the x-axis and red
bars indicate residues with a different consensus amino acid for the genotypes. Correlation of
polymorphism rates between the genotypes (right panels). Black dots indicate residues with
identical consensus for both genotypes, red dots indicate residues where the consensus differs
between the genotypes (see also legend to Figure 3).
Supplementary Figure 3. Comparison of genotype 1a and 3a HCV NS3 and NS5a
sequences in the Australian, Swiss and United Kingdom study cohorts.
A phylogenetic analysis of NS3 and NS5B was performed on the amino acid alignments
described above, using the Neighbor-Joining method based on the p-distance model with pairwise
deletion. Mean genetic distance between the genotypes was calculated using the same
alignments. All analyses were performed using MEGA v3.1 (8).
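
The p-distance with pairwise deletion mentioned in this legend can be sketched as follows. This is not the MEGA implementation: the function names, the choice of `-` and `?` as missing-data symbols, and the toy sequences are assumptions made for illustration. For each pair of aligned sequences, positions where either sequence has a gap are skipped, and the distance is the proportion of the remaining positions that differ.

```python
def p_distance(seq_a, seq_b, missing="-?"):
    """Proportion of differing sites between two aligned sequences.

    Pairwise deletion: a position is ignored only for this pair when either
    sequence carries a gap or missing-data symbol at that position.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(seq_a, seq_b):
        if x in missing or y in missing:
            continue
        compared += 1
        differing += x != y
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


def mean_between_group_distance(group_a, group_b):
    """Average p-distance over all pairs with one sequence from each group."""
    distances = [p_distance(a, b) for a in group_a for b in group_b]
    return sum(distances) / len(distances)


if __name__ == "__main__":
    # Toy amino acid alignments standing in for two genotypes.
    genotype_x = ["MKTA", "MKTV"]
    genotype_y = ["MRTA"]
    print(p_distance("ACDE-", "ACDFG"))                        # 1 of 4 compared sites differ
    print(mean_between_group_distance(genotype_x, genotype_y))
```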
Supplementary Table 1. Consensus differences and genetic distances for genotype 1a and 3a
non-structural HCV proteins (NS2-NS5B).

Protein  Amino   Different consensus  1 nucleotide change  >1 nucleotide change  Genotype differences with
         acids   between genotypes    between genotypes    between genotypes     loss of proteasomal cleavage
NS2      217     92 (42%)             59 (27%)             33 (15%)              30 (14%)
NS3      631     114 (18%)            81 (13%)             33 (5%)               33 (5%)
NS4A     54      11 (18%)             10 (18%)             1 (2%)                3 (5%)
NS4B     261     52 (20%)             39 (15%)             13 (5%)               11 (4%)
NS5A     448     124 (28%)            80 (18%)             45 (10%)              6 (1%)
NS5B     536     122 (23%)            81 (15%)             41 (8%)               19 (4%)

Supplementary Table 2. Codon sites with evidence of positive selection.

Protein  Genotype 1a sites                                        Genotype 3a sites
NS2      814, 824, 837, 856, 859, 896, 904, 921, 925, 998, 1019   824, 849
NS3      1411, 1582, 1634, 1640                                   1444, 1503, 1607
NS4A     Nil                                                      Nil
NS4B     1873                                                     1873
NS5A     2079, 2107, 2197, 2281, 2319, 2374                       1979, 2263, 2307
NS5B     2486, 2510, 2534, 2537, 2540, 2600, 2626, 2729           2486, 2497, 2534, 2626

Sites under positive selection for both genotypes 1a and 3a (bold in the original): 824, 1873,
2486, 2534, 2626.
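
The overlap between the two genotypes in Supplementary Table 2 can be checked with a small set intersection. The site lists below are transcribed from the table; the variable names and the helper `all_sites` are illustrative only, and the 2,147-codon denominator is the figure quoted in the text.

```python
# Positively selected codon sites per protein, transcribed from Supplementary Table 2.
GT1A = {
    "NS2": [814, 824, 837, 856, 859, 896, 904, 921, 925, 998, 1019],
    "NS3": [1411, 1582, 1634, 1640],
    "NS4A": [],
    "NS4B": [1873],
    "NS5A": [2079, 2107, 2197, 2281, 2319, 2374],
    "NS5B": [2486, 2510, 2534, 2537, 2540, 2600, 2626, 2729],
}
GT3A = {
    "NS2": [824, 849],
    "NS3": [1444, 1503, 1607],
    "NS4A": [],
    "NS4B": [1873],
    "NS5A": [1979, 2263, 2307],
    "NS5B": [2486, 2497, 2534, 2626],
}
TOTAL_CODONS = 2147  # codon sites analysed, as stated in the text


def all_sites(table):
    """Flatten a protein -> site-list mapping into one set of site numbers."""
    return {site for sites in table.values() for site in sites}


sites_1a = all_sites(GT1A)
sites_3a = all_sites(GT3A)
shared = sorted(sites_1a & sites_3a)

print(len(sites_1a), round(100 * len(sites_1a) / TOTAL_CODONS, 1))  # 30 1.4
print(len(sites_3a), round(100 * len(sites_3a) / TOTAL_CODONS, 1))  # 13 0.6
print(shared)  # [824, 1873, 2486, 2534, 2626]
```

The counts agree with the 30 and 13 positively selected sites, and the five shared sites, reported in the text.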
Supplementary Table 3A. Codon selection and co-variation at HLA-associated sites (GT1a).
HLA association
Protein
Site
HLA
P
NS2
824
C*1502
NS2
841N
C*0401
N
NS2
851
C*1502
NS2
-
Co-varying amino acid residues
NS3
NS4a
NS4b
NS5a
1093N,
NS2
856N
B*3503
814P,963
1223N
2079P, 2217,
-
1967
1344N
NS2
957
NS2
958
NS2
NS2
962
998P
NS2
1006
NS2
1017N
B*1302
C*0602
B*3701
C*0602
B*3503
B*1302
B*3701
C*0602
B*1501
-
2252, 2268,
2493N
2362, 2373
834N, 846N
883,904
NS5b
-
1695
941,969
1968N
1969N
2079P, 2102,
2287, 2298,
2369, 2373
2435N,
2475N
1006
-
-
-
-
-
885,925P
-
-
-
-
2586N
958
-
-
1723
2330
-
814P
-
-
1946
-
2501
-
-
-
-
-
-
-
2169
-
1106
1686
1969N
2093, 2102,
2435N,
2369
2475N
2008N, 2169,
2485,
2171N, 2185,
2604,
2213, 2339N,
2674,
2362
2747
-
1093N,
NS3
1341
A*1101
821,963
1223N,
965
1344N,
1408N
NS3
NS3
NS3
NS3
NS3
NS3
NS3
NS3
1366
1368
1398N
1403N
1444N
1495N
1503N
1635
C*1502
B*5101
B*0801
B*0801
A*0101
A*0101
C*1203
A*1101
NS4a
1695N
B*2705
846N,883
957,969
834N,879
883,885,
NS4b
1759*
B*3701
904,
1066, 1200
1694N
1964
938,941
1006,1018
NS4b
NS5a
NS5a
NS5a
1876N
2000N
2036N
2155
B*4001
C*0401
A*1402
B*3501
852N
941
-
1444N
-
-
1873P
-
2075N
2324N
NS5a
2227
B*4403
-
-
-
-
2216, 2234
-
NS5a
2234
B*5101
-
-
-
-
2227
-
NS5b
2467
B*1501
941
-
-
1964
2375
NS5b
NS5b
2510P
2796
A*3101
C*0303
906N
-
-
-
-
-
NS5b
2841*
B*2705
-
-
-
-
-
NS5b
2846*
B*2705
-
-
-
-
-
2630,
2747
2844N,
2846
2841
Supplementary Table 3B. Codon selection and co-variation at HLA-associated sites (GT3a).
HLA association
Protein  Site    HLA
NS2      981     B*4403
NS3      1073    A*2402
NS3      1133N   A*0301
NS3      1383    B*5101
NS3      1416N   B*0702
NS3      1444    A*0101
NS3      1560N   A*2402, A*1101
NS3      1637    B*4403
NS3      1646    A*0101
NS4b     1759    B*5701
NS5a     1982    B*5701
NS5a     2248    B*3501, A*0201
NS5a     2320    C*1502
NS5a     2321    C*0602
NS5a     2354    B*3501
NS5a     2367    B*5101
NS5a     2372    A*1101
NS5b     2467    B*1501
-
Co-varying amino acid residues
NS3
NS4a
NS4b
NS5a
2219
1607P
1570N
-
-
1074N
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
NS2
983
-
NS5b
-
Bold = HLA-associated site; * = association in unrestricted dataset only; P = positively selected
and N = negatively selected sites