1
HIV:
Many attempts to develop a vaccine against HIV have failed.
This is due to the virus's rapid mutation rate, which enables it to evade immune recognition.
HBV:
The HBV genome has overlapping coding sequences (CDSs).
Analyzing the implications of mutations that affect overlapping genes (with regard to HBV’s evolution and its interaction with the immune system) may also help us learn about similar viruses with overlapping genes (e.g. HPV).
Our goal is to characterize changes and viral trade-off preferences in the epitope distribution along the CDS of different proteins.
2
An epitope is the part of the antigen that is recognized by the immune system.
Proteasome – degrades proteins within the cell into peptides about 9 AA long.
TAP – delivers cytosolic peptides into the ER, where they bind to MHC class I molecules.
MHC-I – found on every nucleated cell; it displays fragments of proteins from within the cell to T cells.
3
HIV attacks CD4+ cells. The spikes on the surface of the virus particle bind to CD4 and allow the viral envelope to fuse with the cell membrane.
After fusion, the envelope is left behind and the viral reverse transcriptase (RT) converts the RNA genome into DNA. This DNA is then transported to the cell nucleus, where the viral integrase splices it into the human genome.
The HIV provirus may lie dormant within a cell for a long time. When the cell becomes activated, human as well as HIV genes are transcribed using human enzymes.
The messenger RNA is then transported out of the nucleus and is used as a blueprint for translation and replication.
4
Regulatory Proteins
Tat – controls transactivation of all HIV proteins.
Rev – differential regulator of the expression of viral protein genes.
Nef – negative regulatory factor; retards HIV replication.
Accessory Proteins
Vif – viral infectivity factor.
Vpr – undetermined function.
Vpu – required for efficient viral replication and release.
Structural Proteins
GAG – codes for various proteins necessary to protect the virus. Has 3 parts: MA (matrix), CA (capsid), and NC (nucleocapsid).
POL – codes for the enzymes necessary for virus replication. Has 3 parts: PR (protease), IN (integrase), and RT (reverse transcriptase).
ENV – the envelope of the virus. Has 2 parts: SU (surface envelope, gp120) and TM (transmembrane envelope, gp41).
5
HBV is a small enveloped virus with a partially double-stranded circular DNA genome. It is the only member of the Hepadnaviridae family that infects humans.
The HBV genome contains 4 main genes:
Core – encodes the capsid protein.
Pol – encodes a polymerase with reverse transcriptase activity.
Surface – encodes the small, medium and large ER intermembrane proteins.
X – thought to have transcription regulation activity.
The HBV genome has 4 ORFs: the entire Surface gene, the C-terminus of Core and the N-terminus of X overlap with Polymerase.
6
Previous work has shown that HIV tends to decrease the number of epitopes in regulatory proteins, which predominate in the initial stages of replication.
In HBV, on the other hand, the protein copy number seems to affect the epitope density more than the expression time does.
7
The advantage in a mutation that removes an epitope is usually lost when the virus transfers to a new host with different HLA alleles.
Therefore, we expect a high turnover of mutations in potential epitopes in the new host during the transfer.
Mutations affecting the cleavage sites (flanking regions) do not depend on the HLA allele, and will therefore keep providing the virus with this advantage in the new host as well.
8
Multiple Sequence Alignment
Phylogenetic Tree
PreProcessing of Sequences
DNA-based Mutation Positioning Within the AA Sequences
Translation of DNA Sequences
Peptibase
Mutation Characterization for all Alleles
9
The input DNA sequences are aligned using MUSCLE 3.6.
The sequences were retrieved from the LANL HIV Database.
A genetically distant ‘Outgroup’ sequence is added to properly position the root of the tree and reconstruct the ancestral sequences.
The ‘Outgroup’ sequence for the HIV dataset was selected from SIV.
10
The alignment is used to build a phylogenetic tree using the Maximum Parsimony method (Phylip 3.69).
The intermediate sequences built by the program reflect the changes that occurred within the coding sequence of the viral protein.
The phylogenetic tree shows the epitope development of the virus.
11
The sequences reconstructed by the Phylip program may contain ambiguous nucleotides.
These nucleotides are resolved from the bottom of the tree upwards, so that each resolution relies on the original input sequences.
Reconstructed sequences containing a premature stop codon remained in the tree, but were not taken into account in the analysis.
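The stop-codon filter described above could be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the function names are hypothetical, and it assumes gap-free, in-frame coding sequences whose final codon may legitimately be a stop.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_premature_stop(dna):
    """True if a stop codon appears anywhere before the final codon.

    Assumes `dna` is a gap-free coding sequence read in frame 0.
    """
    dna = dna.upper()
    codons = [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]
    return any(codon in STOP_CODONS for codon in codons[:-1])


def sequences_for_analysis(sequences):
    """Keep only the named sequences without a premature stop codon.

    `sequences` maps a node name to its DNA sequence; excluded nodes are
    simply left out of the returned dict (they would still stay in the tree).
    """
    return {name: dna for name, dna in sequences.items()
            if not has_premature_stop(dna)}
```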
12
Mutations between each sequence and its direct descendant were recorded at the DNA level.
Each such mutation was then associated with the matching amino acids in the translated sequences.
Mutation: C → A, between AA1 in father and AA1 in son
Mutation: G → - (deletion), between AA2 in father and AA1 in son
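One way to picture the positioning step is the sketch below. It assumes the father and son sequences come from the same alignment (equal length, `-` for gaps) and uses 1-based amino-acid indices. How a gap is assigned to an amino acid is an assumption of this sketch (the codon holding the nearest preceding nucleotide), and the function name is hypothetical.

```python
def position_mutations(father, son):
    """List (column, father_nt, son_nt, father_aa, son_aa) for each differing column.

    AA indices are 1-based. For a gap character, the index of the codon that
    holds the nearest preceding nucleotide is reported (sketch assumption).
    """
    if len(father) != len(son):
        raise ValueError("aligned sequences must have equal length")
    mutations = []
    f_count = s_count = 0  # non-gap nucleotides seen so far in each sequence
    for col, (f_nt, s_nt) in enumerate(zip(father, son)):
        if f_nt != "-":
            f_count += 1
        if s_nt != "-":
            s_count += 1
        if f_nt != s_nt:
            f_aa = max(f_count - 1, 0) // 3 + 1
            s_aa = max(s_count - 1, 0) // 3 + 1
            mutations.append((col, f_nt, s_nt, f_aa, s_aa))
    return mutations
```

With this convention, a deletion of the first nucleotide of the father's second codon is reported between AA2 in the father and AA1 in the son.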
13
All DNA sequences (input and intermediate) were translated to amino acids and uploaded to the Peptibase server.
The Peptibase server was developed by our lab and is used to predict epitopes within AA sequences.
The analysis performed in Peptibase is conducted on the 31 most frequent HLA alleles, taking into account the allele frequency in the human population.
14
Given an AA sequence, Peptibase uses 3 cut-offs on a 9-mer AA sliding window to predict its epitopes:
Cleavage by the Proteasome
Binding to TAP
Binding to MHC-I
For each 9-mer, cleavage, TAP and MHC-I binding scores are computed.
9-mers passing all three stages are defined as epitopes.
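The three-stage filter can be illustrated with the following sketch. It is not Peptibase's implementation: the scoring functions and cut-off values are placeholders supplied by the caller, and the convention that a higher score passes is an assumption.

```python
def predict_epitopes(protein, cleavage_score, tap_score, mhc_score,
                     cleavage_cutoff, tap_cutoff, mhc_cutoff, window=9):
    """Return (start, peptide) for every window passing all three cut-offs.

    The three *_score arguments are hypothetical callables mapping a peptide
    string to a number; a peptide passes a stage when score >= cut-off.
    """
    epitopes = []
    for start in range(len(protein) - window + 1):
        peptide = protein[start:start + window]
        if (cleavage_score(peptide) >= cleavage_cutoff
                and tap_score(peptide) >= tap_cutoff
                and mhc_score(peptide) >= mhc_cutoff):
            epitopes.append((start, peptide))
    return epitopes
```

In the real analysis the MHC-I stage would be evaluated separately for each HLA allele, so this function would be called once per allele with that allele's scoring function.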
15
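A mutation at the nucleotide level may either change the resulting amino acid (replacement) or not (silent).
We defined 9 types of replacement mutations, according to whether the amino acid lies in an epitope (E), a flanking region (F) or a non-epitope region (N):
E2N, E2F, E2E, F2N, F2F, F2E, N2N, N2F, N2E
Figure: example sequence PGRAFYATGEITGDIR annotated with its regions, in the order N, F, E, F, N.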
16
The mutation type is based on the original affiliation of the amino acid in the father sequence and its new affiliation in the son sequence (epitope, flanking region or non-epitope region).
Father:   E     F     N
Son N:    E2N   F2N   N2N
Son F:    E2F   F2F   N2F
Son E:    E2E   F2E   N2E
For example, an E2N mutation occurred in a nucleotide which belonged to an epitope in the father sequence, and resulted in the loss of this epitope in the son sequence.
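The naming scheme can be expressed compactly in code. The sketch below assumes each sequence already carries a per-amino-acid region label (`E`, `F` or `N`) derived from the epitope predictions. The function names and the 0-based index convention are assumptions of the sketch.

```python
from collections import Counter

REGIONS = {"E", "F", "N"}  # epitope, flanking, non-epitope


def mutation_type(father_region, son_region):
    """Build a label such as 'E2N' from the father and son affiliations."""
    if father_region not in REGIONS or son_region not in REGIONS:
        raise ValueError("region must be one of E, F, N")
    return f"{father_region}2{son_region}"


def count_mutation_types(father_labels, son_labels, positions):
    """Count mutation types over (father_aa, son_aa) index pairs (0-based).

    `father_labels` and `son_labels` are strings with one region letter
    per amino acid of the respective sequence.
    """
    return Counter(mutation_type(father_labels[f], son_labels[s])
                   for f, s in positions)
```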
17
Figure: Full Balance Per Nucleotide Affiliation. Balance (y-axis, from -0.006 to 0.004) for each HIV protein (env, gag, nef, pol, rev, tat, vif, vpr, vpu), with one series each for Epitope, Flanking and Non-epitope.
Full Balance Calculation:
Epitope: E2N + E2F – N2E – F2E
Flanking: F2N + F2E – N2F – E2F
Non-Epitope: N2E + N2F – E2N – F2N
The results were normalized by the average length of the proteins.
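As a worked illustration of the three balance formulas, the sketch below computes them from a dictionary of mutation-type counts. Reading "normalized by the average length" as a plain division is an assumption, and the function name is hypothetical.

```python
def full_balance(counts, mean_protein_length):
    """Compute the epitope, flanking and non-epitope balances.

    `counts` maps mutation types (e.g. 'E2N') to their number of occurrences;
    missing types count as 0. Each balance is divided by the mean protein
    length (assumed meaning of the normalization).
    """
    def c(kind):
        return counts.get(kind, 0)

    raw = {
        "epitope": c("E2N") + c("E2F") - c("N2E") - c("F2E"),
        "flanking": c("F2N") + c("F2E") - c("N2F") - c("E2F"),
        "non_epitope": c("N2E") + c("N2F") - c("E2N") - c("F2N"),
    }
    return {name: value / mean_protein_length for name, value in raw.items()}
```

Because every transition leaves one affiliation and enters another, the three balances always sum to zero, which is a convenient sanity check on the counts.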
18
Consistent with HLA polymorphism, all HIV proteins clearly tend to eliminate flanking regions.
For most proteins the non-epitope balance is approximately 0, except for Nef and Vpu, which accumulate epitopes more than the others, and Rev and Vpr, which remove epitopes.
In the epitope balance, most proteins (again, except for Rev and Vpr) create new epitopes instead of removing them.
An interesting point is the total balance within epitope and flanking regions, which shows a tendency to remove cleavage sites by adding epitopes.
19
Figure: Transition Balance for HIV Proteins. Balance (y-axis, from -0.004 to 0.003) for each HIV protein (env, gag, nef, pol, rev, tat, vif, vpr, vpu), with one series each for E2N – N2E, E2F – F2E and F2N – N2F.
The results were normalized by the average length of the proteins.
20
All HIV proteins tend to remove flanking regions, either completely or by creating a new epitope.
Rev and Vpr prefer to eliminate existing epitopes without creating new epitopes.
21
HBV proteins with multiple copies undergo selection against epitope presentation.
Pol is expressed at low levels and does not go through the same selection.
Epitope-reducing mutations in the other proteins come at the cost of replacement mutations in the overlapping regions of Pol.
22
R/S is the ratio between the number of replacement and silent mutations.
The R/S ratio is significantly higher in regions with two reading frames, since there are few mutations that are simultaneously silent in the two reading frames.
23
Figure: Epitope Turnover. Number of mutations affecting epitopes per 1000 bp in each father-son pair (y-axis, from 0 to 0.9) for Pol-I/Pol-II, c1/c2 and x1/x2, with one series each for Overlapping (2 RFs) and Non-overlapping (1 RF).
Turnover Calculation: E2N + N2E + N2F + F2N
24
The epitope turnover is the number of mutations per 1,000 nucleotides that either add or remove an epitope between a father sequence and its son in the phylogenetic tree.
In the non-overlapping regions of proteins C and X (one reading frame), the turnover is higher than in their overlapping regions.
In the overlapping regions (two reading frames), most mutations are not allowed due to functional constraints.
Pol, which is expressed at low levels and does not tend to remove epitopes, has a lower turnover in its non-overlapping region. Its higher turnover is seen in the overlapping region, due to mutations meant to affect the other genes.
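The turnover for a single father-son pair could be computed along these lines. This is a sketch: the function name is hypothetical, and scaling by the length of the region in nucleotides is an assumption about how "per 1,000 nucleotides" is applied.

```python
def epitope_turnover(counts, region_length_nt):
    """Mutations adding or removing an epitope, per 1,000 nucleotides.

    `counts` maps mutation types to occurrences for one father-son pair;
    the sum E2N + N2E + N2F + F2N follows the slide's turnover formula.
    """
    changed = sum(counts.get(kind, 0) for kind in ("E2N", "N2E", "N2F", "F2N"))
    return 1000.0 * changed / region_length_nt
```

Averaging this value over all father-son pairs, separately for the overlapping and non-overlapping part of each gene, would give the kind of comparison discussed above.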
25
We counted the mutations affecting cleavage sites (epitope-removing mutations per 1,000 nucleotides in each father-son pair of the phylogenetic tree).
Net decrease in the number of cleavage sites: F2N – N2F
This difference is significantly positive in practically all regions.
26
In order for a virus to survive in the presence of a CTL immune response, it must minimize the total number of exposed epitopes.
In HIV and HBV, there is a clear tendency to remove epitopes by eliminating cleavage sites. This may be the viral solution against the HLA polymorphism.
In HBV, there is a strong selection on Core, Surface and X to remove epitopes.
Core and X have an easier time mutating their non-overlapping regions, since in the overlapping regions Pol is also affected.
Pol, having a low copy number, doesn’t try to remove epitopes and is therefore mainly affected in overlapping regions.
27
HIV removes cleavage sites by creating new epitopes.
A possible explanation:
The selection occurs only on the patient’s HLA alleles.
Alleles that are not present in the host do not go through the same selection.
A mutation that eliminates a cleavage site to avoid epitope presentation in a specific HLA allele may create a new epitope in a different allele.
In HIV, Rev and Vpr remove epitopes while other proteins actually accumulate them.
28
Further investigate the phenomenon of cleavage-site destruction producing new epitopes rather than non-epitope nucleotides.
Characterize the changes in the epitope density of a single HIV patient with known HLA serotyping.
…
29
Thank you to:
Prof. Yoram Louzoun, for the dedicated guidance…
Kobi Maman and the whole lab, for all the help…
Prof. Ron Unger
Dr. Rachel Levy Drummer
Ariel Azia Amitai
30
Yewdell, J. W., E. Reits, and J. Neefjes. 2003. Making sense of mass destruction: quantitating MHC class I antigen presentation. Nature Reviews Immunology 3:952-961.
Vider-Shalit, T., M. Almani, R. Sarid, and Y. Louzoun. 2009. The HIV hide and seek game: an immunogenomic analysis of the HIV epitope repertoire. AIDS 23:1311-8.
http://www.righto.com/theories/hiv_genes.html
http://www.avert.org/hiv-virus.htm
http://peptibase.cs.biu.ac.il/peptibase/
http://www.hiv.lanl.gov/content/index
31