Influenza - Athens Academy

Influenza
David L. Suarez
Southeast Poultry Research Laboratory
Agricultural Research Service
U.S. Department of Agriculture
Athens, Georgia
Influenza
• Orthomyxovirus
• Segmented genome
• Pleomorphic, single-stranded RNA viruses
• Three antigenic types: A, B, C
• Type A:
– Human influenza H1N1, H3N2, pandemic H1N1
– Swine influenza H1N1, H3N2
– Equine influenza H3N8, H7N7
– Canine influenza H3N8
– Avian influenza (many bird species) H1-H16, N1-N9
• Vary in pathogenicity
Influenza A Virus
• 10 or 11 influenza proteins
• 9 proteins packaged in virion
– Surface proteins: HA (hemagglutinin), NA (neuraminidase), M2
– Internal proteins: NP, PA, PB1, PB2, M1 and NS2
• NS1 not packaged in virion
• 16 HA subtypes
• 9 NA subtypes
• Gene segments labeled in the diagram: PB1, PB2, PA, HA, NP, NA, MA, NS
Influenza: Infection and Disease
• Infection may cause a wide range of clinical signs, from no disease (asymptomatic), through respiratory disease, to severe disease with high mortality
• Localized Infection-mild to moderate disease
– Intestinal-wild ducks and shorebirds, poultry
– Respiratory-humans, swine, horses, poultry,
domestic ducks, seal, mink
• Systemic Infection-high mortality
– chickens, turkeys, other gallinaceous birds
Swayne, D.E. Epidemiology of Avian Influenza in Agricultural and Other
Man-Made Systems. In: Avian Influenza. Wiley-Blackwell
(www.blackwellpublishing.com), March, 2008.
Main Existing Influenza Lineages
Human influenza H3N2, H1N1
Equine/Canine influenza H3N8
Avian influenza
Swine influenza H1N1, H3N2
Pandemics of influenza
Recorded human pandemic influenza (early sub-types inferred)
– 1889 Russian influenza H2N2
– 1900 Old Hong Kong influenza H3N8
– 1918 Spanish influenza H1N1
– 1957 Asian influenza H2N2
– 1968 Hong Kong influenza H3N2
– 2009 Pandemic influenza H1N1
Recorded new avian influenzas
– H9* 1999
– H5 1997, 2003
– H7 1980
– The years 1996 and 2002 are also marked on the timeline
Reproduced and adapted (2009) with permission of Dr Masato Tashiro, Director, Center for Influenza Virus Research,
National Institute of Infectious Diseases (NIID), Japan.
Genetic origins of the pandemic (H1N1) 2009 virus: viral reassortment
[Figure: the eight gene segments (PB2, PB1, PA, HA, NP, NA, MP, NS) of two parent viruses, N. American H1N1 (swine/avian/human) and an H1N1 of unknown lineage, reassort to give the pandemic (H1N1) 2009 virus, combining swine, avian and human viral components.]
Segment origins listed in the figure legend:
– Classical swine, N. American lineage
– Avian, N. American lineage
– Human seasonal H3N2
– Unknown lineage (closest Eurasian swine)
Origin of Swine Flu?
Virus as a Parasite
• Viruses are very small and encode relatively few genes
• They require host gene products to make viral RNA or DNA and to package the virus
• RNA virus genomes are generally smaller than DNA virus genomes (about 3,000-30,000 bases)
• Most viruses infect a cell, cause the host cell to make huge numbers of copies of the viral RNA, and ultimately kill the host cell
Viral Genes and Host Genes
• Host proteins are needed to make viral proteins from
viral mRNA
• Host proteins help to assemble the virus
• Viral proteins usually make the viral RNA: in flu, 4 proteins (NP, PA, PB2, and PB1) form the polymerase complex that performs this function
• Viral proteins are used to attach to the host cells
(hemagglutinin protein) and exit host cells
(neuraminidase protein)
• Viral proteins are used to evade host immune
response (non-structural proteins)
General flu facts
• Influenza makes viral mRNA that is translated
into protein by the host cell
• Proteins start from the first ATG (methionine)
• Proteins end with any of the 3 stop codons
• The matrix and non-structural genes each yield 2 proteins through mRNA splicing (M1 and M2; NS1 and NS2)
• Host machinery processes proteins including
removing leader sequences and glycosylation
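The start and stop rules above can be sketched in a few lines of Python. This is a minimal illustration, not a gene-prediction tool: `first_orf` and the toy sequence are hypothetical names and data, and the sketch ignores the spliced products (M2, NS2), which the first-ATG rule alone does not capture.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_orf(seq):
    """Return (start, end) of the reading frame that begins at the first ATG.

    `end` is the index just past the first in-frame stop codon.
    Returns None if there is no ATG or no in-frame stop codon.
    """
    seq = seq.upper()
    start = seq.find("ATG")
    if start == -1:
        return None
    # Walk codon by codon in the frame set by the first ATG.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return start, i + 3
    return None

# Toy sequence (made up for illustration): ATG AAA CCC GGG TAA
toy = "GGAGCATGAAACCCGGGTAAGCT"
orf = first_orf(toy)
```

Note that the toy sequence contains an out-of-frame TGA right after the ATG; it is skipped because only in-frame codons are tested. A sequencing error that inserts or deletes a base shifts this frame, which is how such errors tend to show up as premature stop codons.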
Influenza Virus Production
• Influenza has 8 gene segments
• Each segment must be packaged into virus to
be infectious
• How do you get all gene segments into virus?
• Each gene segment has conserved sequence
on 5’ and 3’ ends of segment
• 5’ end is 12 bp AGCAAAAGCAGG
• 3’ end is 13 bp CCTTGTTTCTACT
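A quick way to apply this to a sequence is to test whether it starts and ends with the conserved termini. The sketch below is an assumption-laden illustration: `has_conserved_termini` is a hypothetical helper, the sequence is assumed to be in the same sense as the slides, and position 4 of the 5’ sequence is allowed to be A or G because the segment ends shown later in this deck include both AGCAAAAGCAGG and AGCGAAAGCAGG.

```python
import re

# 12-base 5' terminus; position 4 may be A or G (both appear in the slides).
FIVE_PRIME = re.compile(r"AGC[AG]AAAGCAGG")
# 13-base 3' terminus.
THREE_PRIME = "CCTTGTTTCTACT"

def has_conserved_termini(seq):
    """Return (five_prime_ok, three_prime_ok) for one gene segment sequence."""
    seq = seq.upper()
    five_ok = FIVE_PRIME.match(seq) is not None   # match() anchors at position 0
    three_ok = seq.endswith(THREE_PRIME)
    return five_ok, three_ok
```

A result of `(False, True)` or `(True, False)` does not say what is wrong, only that one end is either missing (a partial sequence) or preceded or followed by something else.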
Flu facts
• Six of the eight gene segments have strictly fixed lengths:
– PB2 2341 bp
– PB1 2341 bp
– PA 2233 bp
– NP 1565 bp
– MA 1027 bp
– NS 890 bp
• No larger gene segments have ever been reported for these genes (in rare cases they are smaller)
• The hemagglutinin and neuraminidase genes are exceptions, with a lot of size variation
Influenza Genes
Each segment is shown as its 5’ end through the start codon (ATG), then its stop codon through the 3’ end.
PB2 2341 bp
  5’ AGCGAAAGCAGG TCAAATAT ATTCAATATG
  3’ TAGTGTC GAATTGTTTA AAAACGA CCTTGTTTCTACT
PB1 2341 bp
  5’ AGCGAAAGCAGG CAAACCAT TTGAATG
  3’ TGAAAAAATG CCTTGTTTCTACT
PA 2233 bp
  5’ AGCGAAAGCAGG TACTGATT CAAAATG
  3’ TAGTTGTGGCAATGCTACTATTTGCTATCCATACTGTCCAAAAAAGTA CCTTGTTTCTACT
HA 1779 bp
  5’ AGCAAAAGCAGG GGTTCAAT CTGTCAAAATG
  3’ TAGTTAAAAACAC CCTTGTTTCTACT
NP 1565 bp
  5’ AGCAAAAGCAGG GTAGATAA TCACTCACCGAGTGACATCC ACATCATG
  3’ TAAAGAAAAATAC CCTTGTTTCTACT
NA 1450 bp
  5’ AGCAAAAGCAGG AGTTCAAA ATG
  3’ TAGAAAAAAANT CCTTGTTTCTACT
MA 1027 bp
  5’ AGCAAAAGCAGG TAGATATT GAAAGATG
  3’ TAGAGCTGGAGTAAAAAACTA CCTTGTTTCTACT
NS 890 bp
  5’ AGCAAAAGCAGG GTGACAAA AACATAATG
  3’ TGATAAAAAACAC CCTTGTTTCTACT
Nucleoprotein
Coding Sequence
AGCAAAAGCAGG GTAGATAA TCACTCACCGAGTGACATCC ACATCATG
TAAAGAAAAATAC CCTTGTTTCTACT
• Nucleoprotein
• 1565 base pairs in length
• Encodes a single protein of 498 amino acids
• Non-coding sequence is present before and after the
coding sequence
• Non-coding sequence acts as a promoter and is thought to be important for virus assembly
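The numbers on this slide can be checked against each other. The sketch below assumes that the spaced fragments shown for the NP ends are contiguous sequence and that the 3’ line begins with the stop codon; under those assumptions the non-coding ends plus a 498-codon reading frame add up to the stated segment length. Variable names are illustrative only.

```python
# NP segment ends as printed on the slide, with the display spaces removed.
five_prime_end = "AGCAAAAGCAGG" "GTAGATAA" "TCACTCACCGAGTGACATCC" "ACATCATG"
three_prime_end = "TAAAGAAAAATAC" "CCTTGTTTCTACT"

# 5' non-coding length: everything before the first ATG.
utr5 = five_prime_end.index("ATG")

# 3' non-coding length: everything after the stop codon (first 3 bases of the line).
utr3 = len(three_prime_end) - 3

# Coding region: 498 codons for the amino acids plus one stop codon.
coding = 498 * 3 + 3

total = utr5 + coding + utr3
print(utr5, coding, utr3, total)
```

With these inputs the 5’ non-coding region is 45 bases, the coding region 1497 bases, and the 3’ non-coding region 23 bases, for a total of 1565.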
Sequencing of Influenza Viruses
• Over 180,000 influenza gene sequences have
been deposited in GenBank representing over
50,000 isolates
• Many of these sequences are only partial gene
sequences that don’t include the non-coding
sequence
• Understanding the contribution of non-coding sequences to the pathogenesis of flu is important
• Roughly 3% of flu sequences in GenBank are estimated to have serious errors
Errors in Flu sequence
• Gene segments are longer than they should be, which is likely the result of:
– Primer sequence included as part of the submission
– For cloned genes, plasmid sequence included
– Taq polymerase induced errors
– Sequence that was poorly aligned and includes extra sequence
• Sequence includes bad sequence that results in insertions or deletions, which in turn produce premature stop codons
GenBank Data Mining
• The Influenza Research Database was searched for NP gene segments >1565 bp
• 266 isolates had NP segments greater than 1565 bp, which should be the maximum size
• Most if not all of these sequences have errors that are apparent on a multiple sequence alignment
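The same length screen can be run on any set of downloaded sequences. The sketch below is a minimal illustration that assumes the sequences are available as FASTA text; `read_fasta`, `oversized`, and the two example records are hypothetical and are not part of the Influenza Research Database or GenBank tooling.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def oversized(records, max_len=1565):
    """Return (header, length) for every record longer than max_len."""
    return [(h, len(s)) for h, s in records if len(s) > max_len]

# Made-up records: one at the expected NP length, one 15 bases too long.
example = [
    ">isolate_A NP", "A" * 1565,
    ">isolate_B NP", "A" * 1500, "A" * 80,
]
flagged = oversized(read_fasta(example))
```

A length filter only nominates candidates; as the slide notes, the errors themselves become apparent on a multiple sequence alignment.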
Bioinformatics Class Assignment
• Identify obvious mistakes in influenza
sequences
• Initially identify sequences with non-influenza
sequence on the 5’ or 3’ end of the gene
segments
• Characterize the types of errors that are
present and correlate that with the
laboratories that produce the sequence
Results
• Analyze the data from all eight gene segments
and publish the results in a peer reviewed
journal
• Contact the laboratories that have mistakes
and give them an opportunity to correct the
errors
• GenBank provides a relatively simple process
to correct sequence data
• Track which labs correct the data
Errors not so obvious
• RT-PCR amplification and sequencing of the PCR product is commonly used:
– Viral RNA is converted to SS DNA by the reverse transcriptase enzyme, starting from a primer
– The SS DNA is copied into DS DNA, again starting from a primer
– PCR with primers amplifies the DS DNA, which can then be sequenced
PCR basics
Denature DS DNA to SS DNA at 94 °C
AGCGCTAGCTAGCTAGCGGCTAGCGTATCGAGCGTAGCGTAG
TCGCGATCGATCGATCGCCGATCGCATAGCTCGCATCGCATC
Anneal primers to SS DNA at 54 °C
AGCGCTAGCTAGCTAGCGGCTAGCGTATCGAGCGTAGCGTAG
                           AGCTCGCATCGCATC
AGCGCTAGCTAGCTA
TCGCGATCGATCGATCGCCGATCGCATAGCTCGCATCGCATC
Repeat the denaturation, annealing, and extension for 30-40 cycles
Mismatches in Primer to Template Can
Still Result in PCR Amplification
AGCGCTAGCTAGCTAGCGGCTAGCGTATCGAGCGTAGCGTAG
AGCTGGCATCGCATC
AGCGCGAGCTAGCTA
TCGCGATCGATCGATCGCCGATCGCATAGCTGGCATCGCATC
Mismatches become incorporated in PCR product
AGCGCGAGCTAGCTAGCGGCTAGCGTATCGACCGTAGCGTAG
TCGCGCTCGATCGATCGCCGATCGCATAGCTGGCATCGCATC
Sequenced PCR Product will include these errors
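The slide's example can be reproduced with a small simulation. This is a simplified sketch, not a model of PCR chemistry: it assumes both primers anneal at the very ends of the template, and it writes the right-hand primer exactly as it is displayed on the slide (aligned under the right end of the top strand, so it is the complement of the top strand without reversal). The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq):
    """Base-by-base complement, in the same left-to-right order as displayed."""
    return seq.translate(COMPLEMENT)

def pcr_product(top, left_primer, right_primer_displayed):
    """Top strand of the product: primer sequences replace the template ends."""
    middle = top[len(left_primer):len(top) - len(right_primer_displayed)]
    return left_primer + middle + complement(right_primer_displayed)

def mismatches(a, b):
    """0-based positions where two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

template = "AGCGCTAGCTAGCTAGCGGCTAGCGTATCGAGCGTAGCGTAG"
left = "AGCGCGAGCTAGCTA"    # one mismatch against the template (T -> G)
right = "AGCTGGCATCGCATC"   # one mismatch against the template's complement

product = pcr_product(template, left, right)
changed = mismatches(template, product)
```

The product differs from the true template at exactly the two primer mismatch positions, and nothing in the product sequence itself marks those bases as primer-derived.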
Conclusions
• Primers must be close but are not always identical to
template
• Primers may introduce errors into PCR product that will
show up in sequence
• Primer sequence should be removed when data is
submitted to GenBank
• Often it isn’t, and errors in sequence may be
introduced in GenBank database
• Errors in sequence make it harder to understand what sequence changes are important for viral infections
• GIGO-garbage in, garbage out
Influenza Sequencing
• Procedures are available to PCR amplify the complete gene segment for all eight genes
• Primers include conserved areas in the non-coding region, including the 12 and 13 bp found in all eight gene segments
• In addition to flu sequence, primers also contain 5’ extensions to improve PCR efficiency because the conserved sequences are so short
• These primer sequences are commonly not
removed before submission to GenBank
Error from Commonly Used Procedure
Primer sequence with 5’ extension, annealed to the start of the gene segment:
ACGTCGATCGCTTTCGTCC
       AGCGAAAGCAGGTACTGATTCAAAATGCCGATCGCT
Primer extension incorporated in PCR product:
ACGTCGATCGCTTTCGTCCATGACTAAGTTTTACGGCTAGCGA
TGCAGCTAGCGAAAGCAGGTACTGATTCAAAATGCCGATCGCT
The sequence includes “extra” DNA that, if not edited out, can be submitted to GenBank
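For this kind of error, the extra bases can be located by searching for the conserved 5’ terminus near the start of the submitted sequence. The sketch below is illustrative only: `split_leading_extra` and its `max_extra` window are hypothetical choices, and the pattern allows A or G at position 4 because both forms appear in the segment ends shown in this deck.

```python
import re

FIVE_PRIME = re.compile(r"AGC[AG]AAAGCAGG")

def split_leading_extra(seq, max_extra=50):
    """Split a sequence into (extra_prefix, flu_sequence).

    Looks for the conserved 12-base 5' terminus starting within the first
    `max_extra` bases. Returns None if it is not found there.
    """
    # endpos limits the search so the motif must start within max_extra bases.
    m = FIVE_PRIME.search(seq.upper(), 0, max_extra + 12)
    if m is None:
        return None
    return seq[:m.start()], seq[m.start():]

# Lower strand of the PCR product from the slide.
pcr_read = "TGCAGCTAGCGAAAGCAGGTACTGATTCAAAATGCCGATCGCT"
extra, flu = split_leading_extra(pcr_read)
```

On the slide's example this separates the 7-base extension `TGCAGCT` from the flu sequence. It cannot tell whether the conserved 12 bases themselves came from the virus or from the primer, which is the harder problem discussed next.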
How to identify primer induced errors
• May not be possible by looking at sequence
directly
• Read the manuscript and look at the experimental detail (if no procedure specifically for sequencing the ends is described, the sequence probably includes primer data)
• Generate your own non-coding sequence data and compare it with the GenBank sequence
Lethality and Molecular Characterization of an HPAI H5N1 Virus
Isolated from Eagles Smuggled from Thailand into Europe
M. Steensels, S. Van Borm, M. Boschmans, and T. van den Berg
Reverse transcription (RT) was performed using an RT primer specific
to a universal noncoding sequence present in all influenza segment RNAs
(Table 1; Unit 12) and AMV reverse transcriptase (Roche), according to
the manufacturer’s instructions, using 4 µl of purified RNA in a 20-µl
reaction volume. Overlapping gene fragments were polymerase chain
reaction (PCR)–amplified using Taq DNA polymerase (Roche) and a 2 µM final concentration of gene-specific primers (Table 1) and 1 µl of
cDNA in 50-µl reactions. PCR was performed using the following
temperature profile: 4 min at 94 C, followed by 45 times the cycle (1 min
at 94 C, then 1 min at 55 C, and 1 min at 72 C). At the end, a final
elongation step of 10 min at 72 C was used. The size of the amplicons was
verified by agarose gel electrophoresis. Subsequently, amplicons of the
correct size were cloned into a pCR2.1-TOPO vector (TOPO TA
cloning kit; Invitrogen, Carlsbad, CA), according to manufacturer’s
instructions. The plasmid DNA from positive colonies was further
purified (Qiaprep miniprep kit; Qiagen, Valencia, CA), according to the
manufacturer’s procedures, and was verified by EcoRI (Roche, according
to manufacturer’s instructions) digestion and agarose gel electrophoresis.
Finally, sequencing reactions were performed using the M13F and M13R
primers (provided with the cloning kit) (BigDyeTerminator, version 3.1,
Select Extra Sequence for Blast
Analysis
Conclusions
• Test sequence had “extra” sequence on 5’ and 3’
end
• 5’ sequence is non-flu sequence added to primer
to improve PCR efficiency
• Review of published paper confirms data
• Original paper shows they cloned sequence
before sequencing
• Part of 3’ sequence appears to be plasmid
sequence
• Origin of remainder of 3’ sequence is unclear
How can you sequence the ends?
• Convert SS linear RNA to circular SS RNA: T4 RNA ligase will connect the RNA ends together
• Do RT-PCR with primers placed so that the product covers the non-coding sequence at the joined ends
• Purify and sequence the PCR product as normal
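Once a read across the joined ends is in hand, the two termini can be separated by finding where the 3’ conserved sequence runs directly into the 5’ conserved sequence. This is a sketch under stated assumptions: the read is in the same sense as the sequences on the earlier slides, the ligation joined the ends with no bases lost or added, and `split_junction` is a hypothetical helper.

```python
import re

# 3' terminus (13 bases) immediately followed by the 5' terminus (12 bases).
JUNCTION = re.compile(r"CCTTGTTTCTACT(AGC[AG]AAAGCAGG)")

def split_junction(read):
    """Return (three_prime_part, five_prime_part) of a read across the junction."""
    m = JUNCTION.search(read.upper())
    if m is None:
        return None
    cut = m.start(1)  # first base of the 5' terminus
    return read[:cut], read[cut:]

# NP ends from the earlier slide, joined as they would be after ligation.
np_three = "TAAAGAAAAATACCCTTGTTTCTACT"
np_five = "AGCAAAAGCAGGGTAGATAA"
parts = split_junction(np_three + np_five)
```

Because the primers in this approach sit inside the segment rather than on its ends, the terminal bases in such a read come from the virus and not from a primer.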