Peptibase


1

Motivation and Goals

HIV:

Many failed attempts have been made to develop vaccines against HIV.

This is largely due to the virus's rapid mutation rate, which enables it to evade immune recognition.

HBV:

The HBV genome has overlapping coding sequences (CDSs).

Analyzing the implications of mutations affecting overlapping genes (with regard to HBV's evolution and its interaction with the immune system) may also help us learn about similar viruses with overlapping genes (e.g. HPV).

Our goal is to characterize changes and viral trade-off preferences in the epitope distribution along the CDS of different proteins.

2

Background – Antigen Presentation

An epitope is the part of the antigen that is recognized by the immune system.

Proteasome – degrades proteins within the cell into peptides about 9 AA long.

TAP – delivers cytosolic peptides into the ER, where they bind to MHC class I molecules.

MHC-I – found on every nucleated cell; displays fragments of proteins from within the cell to T-cells.

3

Background - HIV

HIV attacks CD4+ cells. The spikes on the surface of the virus particle stick to the CD4 and allow the viral envelope to fuse with the cell membrane.

After fusion, the envelope is left behind and the viral reverse transcriptase (RT) converts the RNA genome into DNA. The DNA is then transported to the cell nucleus and integrated into the human genome by the viral integrase.

HIV provirus may lie dormant within a cell for a long time. When the cell becomes activated, human as well as HIV genes are transcribed using human enzymes.

Then the messenger RNA is transported outside the nucleus, and is used as a blueprint for translation and replication.

4

Background – HIV Genes

Regulatory Proteins

Tat – controls transactivation of all HIV proteins.

Rev – differential regulator of expression of virus protein genes.

Nef – negative regulatory factor; retards HIV replication.

Accessory Proteins

Vif – infectivity factor gene.

Vpr – undetermined function.

Vpu – required for efficient viral replication and release.

Structural Proteins

GAG – codes for various proteins necessary to protect the virus. Has 3 parts: MA (matrix), CA (capsid), and NC (nucleocapsid).

POL – codes for the enzymes necessary for virus replication. Has 3 parts: PR (protease), IN (integrase), and RT (reverse transcriptase).

ENV – codes for the envelope of the virus. Has two parts: SU (surface envelope, gp120) and TM (transmembrane envelope, gp41).

5

Background - HBV

HBV is a small enveloped virus with a partially double-stranded circular DNA genome. It is the only member of the Hepadnaviridae family that infects humans.

The HBV genome contains 4 main genes:

Core – encodes the capsid protein.

Pol – encodes a polymerase with reverse transcriptase activity.

Surface – encodes small, medium and large ER intermembrane proteins.

X – thought to have transcription regulation activity.

The HBV genome has 4 ORFs: the entire Surface protein, the C-terminus of Core and the N-terminus of X overlap with Polymerase.

6

Background – Viral Epitopes

Previous work has shown that HIV tends to decrease the number of epitopes in regulatory proteins, which predominate in the initial stages of replication.

In HBV, on the other hand, epitope density seems to depend on the protein copy number more than on the expression time.

7

HLA Polymorphism

The advantage in a mutation that removes an epitope is usually lost when the virus transfers to a new host with different HLA alleles.

Therefore, we expect a high turnover of mutations in potential epitopes in the new host during the transfer.

Mutations affecting the cleavage sites (flanking regions) do not depend on the HLA allele, so they keep their advantage for the virus in the new host as well.

8

Algorithm

Multiple Sequence Alignment

Phylogenetic Tree

PreProcessing of Sequences

DNA-based Mutation Positioning Within the AA Sequences

Translation of DNA Sequences

Peptibase

Mutation Characterization for all Alleles

9

MSA and Phylogenetic Tree

The input DNA sequences are aligned using MUSCLE 3.6.

The sequences were retrieved from the LANL HIV Database.

A genetically distant ‘Outgroup’ sequence is added to properly position the root of the tree and reconstruct the ancestral sequences.

The ‘Outgroup’ sequence for the HIV dataset was selected from SIV.

10

MSA and Phylogenetic Tree

The alignment is used to build a phylogenetic tree using the Maximum Parsimony method (Phylip 3.69).

The intermediate sequences built by the program reflect the changes that occurred within the coding sequence of the viral protein.

The phylogenetic tree shows the epitope development of the virus.

11

PreProcessing of Sequences

The sequences reconstructed by the Phylip program may contain ambiguous nucleotides.

These nucleotides are resolved from the bottom of the tree upwards, so that each resolution relies on the original input sequences.

Reconstructed sequences containing an early stop-codon remained in the tree, but were not taken into account in the analysis.
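The slides do not state the exact rule used to fix ambiguous nucleotides, so the following is only a sketch of one possible bottom-up pass. It assumes the tree is given as a `children` mapping and that an ambiguity code at an internal node is replaced by the compatible base most common among its already-resolved children; both the data layout and the majority rule are assumptions made for illustration.

```python
from collections import Counter

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def resolve_ambiguities(node, children, sequences):
    """Resolve ambiguity codes in place, visiting children before their parent.

    children:  dict mapping a node name to the list of its child names
    sequences: dict mapping a node name to its aligned DNA string
    Leaves (the original input sequences) are never modified.
    """
    kids = children.get(node, [])
    for kid in kids:
        resolve_ambiguities(kid, children, sequences)
    if not kids:
        return

    resolved = []
    for i, code in enumerate(sequences[node]):
        options = IUPAC.get(code.upper(), code)  # gaps and unknown symbols pass through
        if len(options) == 1:
            resolved.append(code)
            continue
        # Hypothetical rule: majority vote among compatible child bases.
        votes = Counter(
            sequences[kid][i] for kid in kids if sequences[kid][i] in options
        )
        resolved.append(votes.most_common(1)[0][0] if votes else options[0])
    sequences[node] = "".join(resolved)
```

Because children are processed first, every choice at an internal node traces back to bases present in the input sequences.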

12

DNA-based Mutation Positioning Within the AA Sequences

Mutations between each sequence and its direct descendant were recorded at the DNA level.

Each such mutation was then associated with the matching amino acids in the translated sequences.

Mutation: C → A

Between: AA1 in father and AA1 in son

Mutation: G → - (gap)

Between: AA2 in father and AA1 in son
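The positioning step can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes father and son are two rows of the same alignment with `-` for gaps, and it assigns a gap to the last residue seen before it, a convention chosen here because it reproduces the two examples above.

```python
def position_mutations(father_dna, son_dna):
    """Return (father_base, son_base, father_aa, son_aa) for each differing column.

    Amino-acid numbers are 1-based. A gap has no residue of its own, so it is
    assigned to the last residue seen before it (an assumption of this sketch).
    """
    if len(father_dna) != len(son_dna):
        raise ValueError("aligned sequences must have equal length")

    mutations = []
    father_seen = son_seen = 0  # ungapped nucleotides consumed so far
    for f_base, s_base in zip(father_dna, son_dna):
        if f_base != "-":
            father_seen += 1
        if s_base != "-":
            son_seen += 1
        if f_base != s_base:
            # (n - 1) // 3 + 1 maps the n-th nucleotide to its codon number.
            father_aa = max(father_seen - 1, 0) // 3 + 1
            son_aa = max(son_seen - 1, 0) // 3 + 1
            mutations.append((f_base, s_base, father_aa, son_aa))
    return mutations


if __name__ == "__main__":
    for mutation in position_mutations("ACGGCA", "AAG-CA"):
        print(mutation)
```

With the toy pair `ACGGCA` / `AAG-CA`, the substitution falls in AA1 of both sequences, while the deleted G belongs to AA2 in the father and is attached to AA1 in the son.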

13

Translation of DNA Sequences and Upload to Peptibase

All DNA sequences (input and intermediate) were translated to AAs and uploaded to the Peptibase server.
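As an illustration of the translation step, here is a small sketch using the standard genetic code. The function names are hypothetical, and the early-stop check is only one plausible way to flag reconstructed sequences that contain a premature stop codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 0; alignment gaps are removed first.

    Codons containing symbols other than A, C, G, T become 'X', and a
    trailing incomplete codon is ignored.
    """
    dna = dna.replace("-", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


def has_early_stop(protein):
    """True if a stop codon ('*') appears before the last position."""
    return "*" in protein[:-1]
```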

The Peptibase server was developed by our lab and is used to predict epitopes within AA sequences.

The analysis performed in Peptibase is conducted on the 31 most frequent HLA alleles, taking into account the allele frequency in the human population.

14

Peptibase

Given an AA sequence, Peptibase uses 3 cut-offs on a 9-mer AA sliding window to predict its epitopes:

Cleavage by the Proteasome

Binding to TAP

Binding to MHC-I

For each 9-mer, cleavage, TAP and MHC-I binding scores are computed.

9-mers passing all three stages are defined as epitopes.
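The three-stage filter above can be sketched as a sliding window with pluggable scoring functions. This is not Peptibase's implementation: the scoring functions and cut-off values are left to the caller because the slides do not specify them, and scoring cleavage on the 9-mer alone (without its flanking residues) is a simplification.

```python
def predict_epitopes(protein, cleavage_score, tap_score, mhc_score,
                     cleavage_cutoff, tap_cutoff, mhc_cutoff, window=9):
    """Return (start, peptide) for every window that passes all three cut-offs.

    The three *_score arguments are caller-supplied functions that map a
    peptide string to a number; higher is assumed to mean "more likely".
    """
    epitopes = []
    for start in range(len(protein) - window + 1):
        peptide = protein[start:start + window]
        if (cleavage_score(peptide) >= cleavage_cutoff
                and tap_score(peptide) >= tap_cutoff
                and mhc_score(peptide) >= mhc_cutoff):
            epitopes.append((start, peptide))
    return epitopes


if __name__ == "__main__":
    # Toy scorers for demonstration only; they carry no biological meaning.
    ends_with_t = lambda p: 1.0 if p.endswith("T") else 0.0
    always_pass = lambda p: 1.0
    f_at_second = lambda p: 1.0 if p[1] == "F" else 0.0
    print(predict_epitopes("PGRAFYATGEITGDIR",
                           ends_with_t, always_pass, f_at_second,
                           0.5, 0.5, 0.5))
```

Keeping the scorers as parameters makes it easy to repeat the scan once per HLA allele by swapping only the MHC-I scoring function.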

15

Mutation Characterization

A mutation at the nucleotide level may either change the resulting amino acid (replacement) or not (silent).

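We defined 9 types of replacement mutations: E2N, E2F, E2E, F2N, F2F, F2E, N2N, N2F and N2E, where E is an epitope, F a flanking region and N a non-epitope region.

[Figure: example sequence PGRAFYATGEITGDIR annotated, from left to right, as non-epitope (N), flanking (F), epitope (E), flanking (F) and non-epitope (N) regions.]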

16

Mutation Characterization

The mutation type is based on the original affiliation of the amino acid in the father sequence and its new affiliation in the son sequence: an epitope (E), a flanking region (F) or a non-epitope region (N).

E2N  F2N  N2N

E2F  F2F  N2F

E2E  F2E  N2E

For example, an E2N mutation occurred in a nucleotide that belonged to an epitope in the father sequence and resulted in the loss of this epitope in the son sequence.
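The naming scheme can be illustrated with a short sketch. It assumes each residue has already been labelled E, F or N from the predicted epitope positions; the one-residue flank width and the helper names are assumptions made for this example.

```python
REGIONS = ("E", "F", "N")


def label_regions(length, epitope_starts, window=9, flank=1):
    """Label each residue as E (epitope), F (flanking) or N (non-epitope).

    flank is the number of residues on each side of an epitope that count
    as its flanking region (one residue here, as an assumption).
    """
    labels = ["N"] * length
    for start in epitope_starts:
        for i in range(max(0, start - flank),
                       min(length, start + window + flank)):
            labels[i] = "F"
    # Epitope residues take priority over flanking residues.
    for start in epitope_starts:
        for i in range(start, min(length, start + window)):
            labels[i] = "E"
    return "".join(labels)


def mutation_type(father_region, son_region):
    """Build the type name, e.g. ('E', 'N') -> 'E2N'."""
    if father_region not in REGIONS or son_region not in REGIONS:
        raise ValueError("regions must be one of E, F, N")
    return f"{father_region}2{son_region}"


if __name__ == "__main__":
    father = label_regions(16, [3])  # one epitope starting at residue 3
    son = label_regions(16, [])      # the epitope is lost in the son
    print(father, son, mutation_type(father[5], son[5]))
```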

17

Results – HIV (Full Balance)

[Figure: Full Balance Per Nucleotide Affiliation. Chart of the epitope, flanking and non-epitope balance for env, gag, nef, pol, rev, tat, vif, vpr and vpu; y-axis from -0.006 to 0.004.]

Full Balance Calculation

Epitope: E2N + E2F – N2E – F2E

Flanking: F2N + F2E – N2F – E2F

Non-Epitope: N2E + N2F – E2N – F2N

The results were normalized by the average length of the proteins.
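The balance formulas can be written out as a short function. This is a sketch under assumptions: `counts` is a hypothetical dictionary of mutation-type tallies for one protein, and `length` stands for the normalizing length mentioned above.

```python
def full_balance(counts, length):
    """Compute the three balances from mutation-type counts.

    counts: dict such as {"E2N": 3, "F2E": 4, ...}; missing types count as 0
    length: normalizing length (for example the average protein length)
    """
    def c(key):
        return counts.get(key, 0)

    balance = {
        "epitope": c("E2N") + c("E2F") - c("N2E") - c("F2E"),
        "flanking": c("F2N") + c("F2E") - c("N2F") - c("E2F"),
        "non_epitope": c("N2E") + c("N2F") - c("E2N") - c("F2N"),
    }
    return {name: value / length for name, value in balance.items()}


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    example = {"E2N": 3, "E2F": 1, "N2E": 2, "F2E": 4, "F2N": 5, "N2F": 1}
    print(full_balance(example, 100))
```

Every transition enters one balance with a plus sign and another with a minus sign, so the three balances always sum to zero. With this sign convention, a positive value means a net loss of nucleotides from that affiliation.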

18

Results – HIV (Full Balance)

Consistent with HLA polymorphism, all HIV proteins clearly tend to eliminate flanking regions.

For most proteins, the non-epitope balance is approximately 0, except for Nef and Vpu which accumulate epitopes more than others, and Rev and Vpr which remove epitopes.

In the epitope balance, most proteins (again, except for Rev and Vpr) create new epitopes instead of removing them.

An interesting point to notice is the total balance within epitope and flanking regions, where there is a tendency to remove cleavage sites by adding epitopes.

19

Results – HIV (Transition Balance)

[Figure: Transition Balance for HIV Proteins. Chart of E2N-N2E, E2F-F2E and F2N-N2F for env, gag, nef, pol, rev, tat, vif, vpr and vpu; y-axis from -0.004 to 0.003.]

The results were normalized by the average length of the proteins.

20

Results – HIV (Full Balance)

All HIV proteins tend to remove flanking regions, either completely or by creating a new epitope.

Rev and Vpr prefer to eliminate existing epitopes without creating new epitopes.

21

Results – HBV (R/S Ratio)

HBV proteins with multiple copies undergo selection against epitope presentation.

Pol is expressed at low levels and does not go through the same selection.

Epitope-reducing mutations in other proteins come at the expense of replacement mutations in the overlapping regions of Pol.

22

Results – HBV (R/S Ratio)

R/S is the ratio between the number of replacement and silent mutations.

The R/S ratio is significantly higher in regions with two reading frames, since there are few mutations that are simultaneously silent in the two reading frames.
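The scarcity of doubly silent mutations can be checked by brute force on any short sequence. The sketch below enumerates every single-nucleotide substitution and counts how many are silent in reading frame 0 alone and in frames 0 and 1 together; the toy sequence and the choice of frames are assumptions for illustration, not HBV data.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_at(dna, pos, frame):
    """Codon of reading frame `frame` that covers `pos`, or None if incomplete."""
    start = pos - (pos - frame) % 3
    if start < 0 or start + 3 > len(dna):
        return None
    return dna[start:start + 3]


def silent_counts(dna):
    """Return (total, silent_in_frame0, silent_in_both_frames).

    Only positions covered by a complete codon in both frames 0 and 1 are
    mutated, so the two silent counts are directly comparable.
    dna must contain only the upper-case letters A, C, G and T.
    """
    total = silent_one = silent_both = 0
    for pos, base in enumerate(dna):
        if None in (codon_at(dna, pos, 0), codon_at(dna, pos, 1)):
            continue
        for new_base in BASES:
            if new_base == base:
                continue
            mutant = dna[:pos] + new_base + dna[pos + 1:]
            same = [
                CODON_TABLE[codon_at(dna, pos, f)]
                == CODON_TABLE[codon_at(mutant, pos, f)]
                for f in (0, 1)
            ]
            total += 1
            silent_one += same[0]
            silent_both += same[0] and same[1]
    return total, silent_one, silent_both


if __name__ == "__main__":
    total, one, both = silent_counts("ATGGCTCTCAAAGGT")
    print(total, one, both)
```

From these counts, R/S for a single frame is `(total - one) / one` and for two overlapping frames `(total - both) / both`; the second can only be larger or equal, and it is undefined when no mutation is silent in both frames.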

23

Results – HBV (Epitope turnover)

[Figure: Epitope Turnover. Chart of the number of mutations affecting epitopes per 1000 bp in each father-son pair, for overlapping (2 RFs) and non-overlapping (1 RF) regions of Pol-I/Pol-II, c1/c2 and x1/x2; y-axis from 0 to 0.9.]

Turnover Calculation: E2N + N2E + N2F + F2N

24

Results – HBV (Epitope turnover)

The epitope turnover is the number of mutations per 1,000 nucleotides either adding or removing an epitope between a father sequence and its son in the phylogenetic tree.

In the non-overlapping regions of proteins C and X (one reading frame), there is a higher turnover than in overlapping regions.

In their overlapping regions (two reading frames), most mutations are not allowed due to functional constraints.

Pol, which is expressed at low levels and does not tend to remove epitopes, has a lower turnover in its non-overlapping region. Its higher turnover is seen in the overlapping region, due to mutations that affect the other genes.
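The turnover defined above can be computed along the following lines. This is a sketch: it uses the four mutation types from the turnover calculation on the previous slide, and averaging the per-pair values over all father-son pairs is an assumption about how the per-pair numbers are combined.

```python
TURNOVER_TYPES = ("E2N", "N2E", "N2F", "F2N")


def pair_turnover(counts, region_length):
    """Turnover for one father-son pair, per 1,000 nucleotides.

    counts:        dict of mutation-type tallies for this pair and region
    region_length: region length in nucleotides
    """
    changed = sum(counts.get(kind, 0) for kind in TURNOVER_TYPES)
    return 1000.0 * changed / region_length


def mean_turnover(pair_counts, region_length):
    """Average the per-pair turnover over all father-son pairs (assumed rule)."""
    values = [pair_turnover(counts, region_length) for counts in pair_counts]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Made-up tallies for two father-son pairs in a 3,000-nucleotide region.
    pairs = [{"E2N": 1, "N2E": 2, "F2N": 3, "E2E": 10}, {"N2F": 3}]
    print(mean_turnover(pairs, 3000))
```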

25

Results – HBV

We measured the number of mutations affecting the cleavage sites (epitope-removing mutations per 1,000 nucleotides in each father-son pair of the phylogenetic tree).

The difference F2N – N2F is significantly positive in practically all regions.

[Figure: Net Decrease in the Number of Cleavage Sites (F2N – N2F).]

26

Conclusions

In order for a virus to survive in the presence of a CTL immune response, it must minimize the total number of exposed epitopes.

In HIV and HBV, there is a clear tendency to remove epitopes by eliminating cleavage sites. This may be the viral solution against the HLA polymorphism.

In HBV, there is a strong selection on Core, Surface and X to remove epitopes.

Core and X have an easier time mutating their non-overlapping regions, since in the overlapping regions Pol is also affected.

Pol, having a low copy number, doesn't try to remove epitopes and is therefore mainly affected in overlapping regions.

27

Conclusions

HIV removes cleavage sites by creating new epitopes.

A possible explanation:

The selection occurs only on the patient's HLA alleles.

The other alleles not present in the host do not go through the same selection.

A mutation eliminating a cleavage site to avoid epitope presentation in the specific HLA allele may create a new epitope in a different allele.

In HIV, Rev and Vpr remove epitopes while other proteins actually accumulate them.

28

Open Questions & Future Goals

Further investigate the phenomenon of cleavage site destruction producing new epitopes rather than non-epitope nucleotides.

Characterize the changes in the epitope density of a single HIV patient with known HLA serotyping.

29

Acknowledgements

Thank you to:

Prof. Yoram Louzoun, for the dedicated guidance…

Kobi Maman and the whole lab, for all the help…

Prof. Ron Unger

Dr. Rachel Levy Drummer

Ariel Azia Amitai

30

Bibliography

Yewdell, J. W., E. Reits, and J. Neefjes. 2003. Making sense of mass destruction: quantitating MHC class I antigen presentation. Nature Reviews Immunology 3:952-961.

Vider-Shalit, T., M. Almani, R. Sarid, and Y. Louzoun. 2009. The HIV hide and seek game: an immunogenomic analysis of the HIV epitope repertoire. AIDS 23:1311-8.

http://www.righto.com/theories/hiv_genes.html

http://www.avert.org/hiv-virus.htm

http://peptibase.cs.biu.ac.il/peptibase/

http://www.hiv.lanl.gov/content/index

31
