A BIOINFORMATIC ANALYSIS OF THE MONONEGAVIRALES
TRANSCRIPTION/REPLICATION COMPLEX THROUGH
THE DEVELOPMENT OF THE DISICC PIPELINE
by
Sean Bruce Cleveland
A dissertation submitted in partial fulfillment
of the requirements for the degree
of
Doctor of Philosophy
in
Microbiology
MONTANA STATE UNIVERSITY
Bozeman, Montana
April, 2013
©COPYRIGHT
by
Sean Bruce Cleveland
2013
All Rights Reserved
APPROVAL
of a dissertation submitted by
Sean Bruce Cleveland
This dissertation has been read by each member of the dissertation committee and
has been found to be satisfactory regarding content, English usage, format, citation,
bibliographic style, and consistency and is ready for submission to The Graduate School.
Marcella A. McClure
Approved for the Department of Microbiology
Mark Jutila
Approved for The Graduate School
Dr. Ronald W. Larsen
STATEMENT OF PERMISSION TO USE
In presenting this dissertation in partial fulfillment of the requirements for a
doctoral degree at Montana State University, I agree that the Library shall make it
available to borrowers under rules of the Library. I further agree that copying of this
dissertation is allowable only for scholarly purposes, consistent with “fair use” as
prescribed in the U.S. Copyright Law. Requests for extensive copying or reproduction of
this dissertation should be referred to ProQuest Information and Learning, 300 North
Zeeb Road, Ann Arbor, Michigan 48106, to whom I have granted “the exclusive right to
reproduce and distribute my dissertation in and from microform along with the nonexclusive right to reproduce and distribute my abstract in any format in whole or in part.”
Sean Bruce Cleveland
April 2013
DEDICATION
I dedicate this dissertation to my fiancé Jessica and my mother Shelby for their
undying support and understanding all these years.
ACKNOWLEDGEMENTS
I would like to thank the faculty, staff and other students at Montana State
University with whom I have worked for over a decade. Every one of them has had a
hand in shaping me into the professional and scientist I am today and this would not have
been possible without them.
I would also like to personally thank Dr. Marcella A. McClure, who has mentored
me in Bioinformatics, Virology and Evolution. Without her support, knowledge and gift
for teaching I would not have been inspired to come so far.
TABLE OF CONTENTS
1. INTRODUCTION
Summary
Background and Significance
The Four Families
Bornaviridae
Paramyxoviridae
Filoviridae
Rhabdoviridae
Vesicular Stomatitis Virus (VSV) - The Prototype of the Order
Pathology and Epidemiology
Vesicular Stomatitis Virus Particle
RdRp Complex
Nucleoprotein (N)
Phosphoprotein (P)
Large Subunit Polymerase (L)
Methods
Multiple Sequence Alignments
Phylogenetic Trees
Disorder and Consensus Prediction
IUPRED
Regional Order Neural Network (RONN)
DisEMBL
PONDR
PONDR Fit
CORNET
ConSEQ
Xdet
Co-evolution Analysis using Protein Sequences (CAPS)
Identification of Co-evolution/Intra-Residue Contact Predictions (CICPs)
2. A BIOINFORMATICS APPROACH TO THE STRUCTURE, FUNCTION, AND EVOLUTION OF THE NUCLEOPROTEIN OF THE ORDER MONONEGAVIRALES
Contribution of Authors and Co-Authors
Manuscript Information Page
Abstract
Introduction
Results
Phylogenetic Analysis
Disorder Prediction
Co-evolution and Intra-residue Contact
Structural Analysis
Discussion
Phylogenetic Reconstruction
Disorder
Co-evolution and Intra-residue Contact
Structural Analysis
Materials and Methods
Phylogenetic Reconstruction
Disorder
Correlated Mutations and Intra-Residue Contact Prediction
Structural Analysis
Hydrophobic Residues and MSA Conservation
Acknowledgements
References
3. DISORDER, INTRA-RESIDUE CONTACT AND COEVOLUTION PREDICTION OF THE LARGE SUBUNIT POLYMERASE AND PHOSPHOPROTEIN FOR THE ORDER MONONEGAVIRALES USING THE DISICC PIPELINE
Contribution of Authors and Co-Authors
Manuscript Information Page
Abstract
Introduction
Results
L Disorder Predictions
P Disorder Predictions
Co-evolution and Intra-Residue Contacts
L CICP Results
P CICP Results
Discussion
Disorder Prediction of L and P
Co-evolution and Intra-Residue Contact for L and P
Materials and Methods
Multiple Sequence Alignment
Disorder
Correlated Mutations and Intra-Residue Contact Prediction
Hydrophobic Residues and MSA Conservation
References
4. THE DISICC PIPELINE AND DISICC DATABASE
Software Stack and Application
Database Schema
Data Objects
Data Visualization
Running the Pipeline
Quality Control
Future Work
Availability
5. GENERAL CONCLUSION
Summary of the Study
Nucleoprotein Conclusions
L Polymerase Conclusions
Phosphoprotein Conclusions
Conclusion
REFERENCES CITED
APPENDICES
APPENDIX A: Supplementary Table 2.1
APPENDIX B: Supplementary Figures For Chapter 3
APPENDIX C: Supplementary Table 3.1
APPENDIX D: Supplementary Table 3.2
LIST OF TABLES
Table
1.1 Mononegavirales
S2.1 List of predicted Disordered and CICP residues for each virus's N protein
S3.1 List of predicted Disordered and CICP residues for each virus's L protein
S3.2 List of predicted Disordered and CICP residues for each virus's P protein
LIST OF FIGURES
Figure
1.1 Prototypic Genome for Mononegavirales
1.3 Schematic of VSV RNA Synthesis
2.1 Phylogenetic reconstruction of 63 nucleoprotein sequences of the order Mononegavirales
2.2 Disorder and CICP mapped residues of Family MSAs
2.3 Entire Order Disorder and CICP mapped residues on the MSA
2.4 CICP Alignment Consensus Graphs
2.5 Disorder and CICP mapped Crystal structures of the Rabies Virus Nucleoprotein-RNA complex (2GTT)
2.6 CICP and Disorder mapped Crystal structures of the Rabies Virus Nucleoprotein-RNA complex (2GTT) subunit-Chain A
2.7 Crystal structure of Vesicular Stomatitis Indiana Virus nucleocapsid complex with the phosphoproteins’ nucleocapsid-binding domain (3HHW)
3.1 Disorder and CICP mapped residues of Family MSAs for L
3.2 Disorder and CICP mapped residues of the Order for L
3.3 Disorder and CICP mapped residues of Family MSAs for P
4.1 DisICC Application Organization
4.2 DisICC Database and Object Schema
4.3 Parallel Coordinates sample graph of the P order result from DisICC
S3.1 Disorder Alignment Consensus Graphs for 63 L polymerase sequences
S3.2 Disorder Alignment Consensus Graphs for 63 P sequences
S3.3 CICP Alignment Consensus Graphs for 37 L polymerase sequences
S3.4 CICP Alignment Consensus Graphs for 15 Paramyxovirinae P sequences
ABSTRACT
The viral members of the Order Mononegavirales are responsible for numerous
diseases with high mortality and few if any treatments. Unfortunately, knowledge of
these viruses is limited. Attempts to study the structure of the replication/transcription
complex of these viruses using physical methods like X-ray crystallography and NMR
spectroscopy have been largely unsuccessful due to the large size of this complex, as well
as the amount of disorder these proteins show when isolated.
The goal of this bioinformatic study is to investigate sequence conservation in
relation to evolutionary function/structure of the nucleoprotein (N), large subunit
polymerase protein (L) and phosphoprotein (P) of the Order Mononegavirales. In the
combined analysis, sequences from 63 representative viruses of the four viral families
(Paramyxoviridae, Rhabdoviridae, Filoviridae, and Bornaviridae) were analyzed using a
newly developed Disorder, Intra-residue contact and Compensatory mutation Correlator
(DisICC) pipeline.
The N protein results indicate conservation of disorder in the C-terminal region
of the N viral proteins, which is important for interacting with P and L during
transcription and replication. Portions of the N-terminus are responsible for N:N
stability, with interactions identified by the presence or lack of co-evolving
intra-protein contact predictions.
Correlations between location and conservation of predicted regions reveal strong
divisions between families while highlighting conservation within individual families in
L, suggesting that L Domains are conserved across the Order with strong intra-sequence
pressures for conservation, while hinge regions lack these pressures. Conserved disorder
is reported for: the amino terminus of L for L-L complex formation across all families,
Domain V for capping activity across Paramyxovirinae and Vesiculovirus, and Domain
VI for cap methylation across Paramyxovirinae, Rubulaviruses,
Avulaviruses, Ferlavirus and Morbilliviruses. The P sequences show a strong
conservation of disorder within viral families that corresponds to their binding Domains
with little intra-sequence pressure.
Validation of these predictions by current experimental and structural information
illustrates the benefits of the DisICC pipeline for characterizing protein disorder and
intra-residue contact that can reveal likely residues as disruption targets in these viruses
that are infectious to humans.
INTRODUCTION
Summary
The Centers for Disease Control and Prevention have included the Ebola and
Marburg viruses, both negative-strand RNA viruses belonging to the order
Mononegavirales, in their list of Bioterrorism Agents/Diseases. However, structural
knowledge of these pathogens is limited. Mononegavirales (Table 1.1) is composed of
four viral families. Bornaviridae contains the Borna Disease Virus (BDV), which affects
the nervous system and the brain in many animals, including cows and rats;
endogenous Borna-like nucleoprotein element sequences also exist within the human genome
[1]. Paramyxoviridae includes Sendai Virus (SENV), which typically affects rats and
mice, and two viruses that cause childhood epidemics, Measles Virus (MeV) and Mumps
Virus (MuV). Filoviridae has only two members, Ebolavirus and Marburgvirus, which can
cause hemorrhagic fevers with mortality rates up to 90% in humans [2,3].
Rhabdoviridae contains Rabies virus (RABV) and Vesicular Stomatitis viruses, both of
which are capable of animal-to-human transmission, as are many Mononegavirales.
Vesicular Stomatitis virus (VSV) is the model for the Rhabdoviridae family and the
prototype for most studies of transcription and replication for the entire order of
Mononegavirales [4]. VSV and Rabies are also used in therapies for cancer and
experimental vaccines against Human Immunodeficiency virus and influenza [5-7].
Negative-strand RNA viruses are unique in that their RNA genomes are always
encapsidated by a virally encoded nucleoprotein to form a ribonucleoprotein (RNP) complex.
This complex serves as the template for viral RNA synthesis and forms the structural core
of the viruses when packaged into virions [8]. The RNP is formed concurrently with
transcription/replication by the viral RNA-dependent RNA polymerase (RdRp). For all of
Mononegavirales, the RdRp complex is composed of the negative-sense RNA genome
and three proteins: nucleoprotein (N), phosphoprotein (P) and the large subunit
polymerase protein (L). The RNA genome of this complex is always found associated
with the nucleoprotein as the RNP. This structure is resistant to nucleases, even during
synthesis [9,10]. The nucleoprotein is not only important for the encapsidation of the RNA
for transcription; it has also been shown to interact with itself, the large subunit
polymerase protein, and the phosphoprotein during the generation of mRNAs for protein
expression [11].
This chapter provides context for the viral Order Mononegavirales, its families, and the
prototypic virus, Vesicular Stomatitis virus. It concludes with an overview of the in silico
techniques used to study the representative viral members of the Order and to provide
consensus metrics of protein interaction sites, an important aspect of the
structure/function paradigm.
Background and Significance
The viral members of Mononegavirales are responsible for many diseases with
high mortality rates and few if any treatments. Unfortunately, the lack of structural
knowledge impedes the successful development of anti-viral strategies. Attempts to
study the structure of the replication/transcription complex of viruses of the order
Mononegavirales using physical methods like X-ray crystallography and NMR
spectroscopy have been largely unsuccessful due to the large size of this complex, as well
as the amount of disorder these proteins show when isolated. Although structural
information for these viruses is lacking, they are well sequenced and provide a solid
foundation for bioinformatics studies. Hence, these viruses provide an excellent test group
for predictive methods that focus on using sequence information independent of other data.
The Four Families
Mononegavirales is currently composed of four viral families: Bornaviridae,
Filoviridae, Paramyxoviridae, and Rhabdoviridae (Table 1.1). All Mononegavirales
possess negative-sense single-stranded monopartite RNA genomes with similar
transcription/replication complexes.
Bornaviridae: Bornaviridae contains the Borna disease virus (BDV), which
affects the nervous system and the brain in many vertebrate species, primarily
warm-blooded animals, and is characterized by neurotropic noncytopathic replication and
persistent infection [12-17]. Originally BDV was the only known member of
Bornaviridae, but in 2008 Avian borna virus was identified [18-20]. Evidence suggests
that BDV infects humans and causes certain mental disorders [15,21,22]. BDV differs
from other viruses in the Order in that its transcription is localized to the nucleus of
the infected cells, rather than to the cytoplasm as in the other members [14].
Table 1.1 Mononegavirales. Each of the four families is listed with its sub-families,
genera and species. The rank of each group is labeled, and species are indented beneath
their genus.
Family Bornaviridae
  Genus Bornavirus
    Borna disease virus
Family Rhabdoviridae
  Genus Cytorhabdovirus
    Northern cereal mosaic virus
    Lettuce necrotic yellow virus
  Genus Ephemerovirus
    Bovine ephemeral fever virus
  Genus Lyssavirus
    Australian bat lyssavirus
    Rabies virus
    Mokola virus
  Genus Novirhabdovirus
    Snakehead virus
    Viral hemorrhagic septicemia virus
    Infectious hematopoietic necrosis virus
    Hirame rhabdovirus
  Genus Nucleorhabdovirus
    Maize mosaic virus
    Maize fine streak virus
    Rice yellow stunt virus
    Sonchus yellow net virus
    Taro vein chlorosis virus
  Genus Vesiculovirus
    Spring viremia of carp virus
    Vesicular stomatitis Indiana virus
    Vesicular stomatitis San Juan virus
    Vesicular stomatitis New Jersey virus
    Chandipura virus
    Isfahan virus
    Siniperca chuatsi rhabdovirus
Family Paramyxoviridae
  Sub-family Paramyxovirinae
    Genus Avulavirus
      Avian paramyxovirus 6
      Goose paramyxovirus
      Newcastle disease virus
    Genus Ferlavirus
      Fer-de-lance virus
    Genus Henipavirus
      Nipah virus
      Hendra virus
    Genus Morbillivirus
      Canine distemper virus
      Phocine distemper virus
      Dolphin morbillivirus
      Peste-des-petits-ruminants virus
      Measles virus
      Rinderpest virus
    Genus Respirovirus
      Human parainfluenza virus 1
      Sendai virus
      Bovine parainfluenza virus 3
      Human parainfluenza virus 3
    Genus Rubulavirus
      Mumps virus
      Tioman virus
      Menangle virus
      Simian parainfluenza virus 41
      Human parainfluenza virus 2
      Simian parainfluenza virus 5
    Unclassified
      Tupaia paramyxovirus
      Mossman virus
      Beilong virus
      J virus
  Sub-family Pneumovirinae
    Genus Pneumovirus
      Human pneumovirus
      Avian pneumovirus
      Human respiratory syncytial virus A2
      Human respiratory syncytial virus B1
      Human respiratory syncytial virus S2
      Respiratory syncytial virus
      Bovine respiratory syncytial virus
    Genus Metapneumovirus
      Pneumonia virus of mice 15
      Pneumonia virus of mice J3666
Family Filoviridae
  Genus Ebolavirus
    Reston ebolavirus
    Sudan ebolavirus
    Zaire ebolavirus
  Genus Marburgvirus
    Lake Victoria marburgvirus
The BDV X protein is a nonstructural protein 87 amino acids in length [23,24], and its
expression has been shown to be tightly regulated by translational and transcriptional
mechanisms [25,26]. The X protein is an important regulator of viral RNA synthesis
and polymerase complex assembly [27], and recombinant viruses encoding an
inactivated X gene or an X protein without a functional P-binding domain were shown to
be nonviable [28].
Figure 1.1 Prototypical Genome for Mononegavirales based on Vesicular stomatitis virus. The
gene products are: the nucleoprotein (blue), phosphoprotein (green), matrix protein
(orange), glycoprotein (yellow) and large polymerase subunit (yellow). Processive
transcription produces the nucleoprotein, encoded at the 3’ end, in the highest
concentration, with lower concentrations for genes further along the genome.
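The gradient described in the caption can be illustrated with a minimal sketch. It assumes a simple model in which a fixed fraction of polymerases fails to continue at each gene junction. The function name `relative_mrna_levels` and the default attenuation of 0.3 are hypothetical values chosen for illustration, not figures reported in this dissertation.

```python
# Illustrative sketch only: the attenuation fraction is an assumed value,
# not a measurement reported in this chapter.
GENE_ORDER = ["N", "P", "M", "G", "L"]  # 3' to 5' order on the genome


def relative_mrna_levels(gene_order, attenuation=0.3):
    """Return mRNA abundance of each gene relative to the 3'-most gene.

    Assumes the polymerase enters at the 3' end and that a fixed fraction
    (`attenuation`) of polymerases fails to continue at each gene junction.
    """
    if not 0.0 <= attenuation < 1.0:
        raise ValueError("attenuation must be in [0, 1)")
    levels = {}
    level = 1.0
    for gene in gene_order:
        levels[gene] = level
        # Only the remaining fraction of polymerases reaches the next gene.
        level *= 1.0 - attenuation
    return levels


if __name__ == "__main__":
    for gene, level in relative_mrna_levels(GENE_ORDER).items():
        print(f"{gene}: {level:.3f}")
```

Under this assumed model the relative level falls geometrically with gene position, so N is always the most abundant transcript and L the least, whatever attenuation fraction is chosen.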
Paramyxoviridae: Paramyxoviridae consists of two sub-families,
Paramyxovirinae and Pneumovirinae. Paramyxovirinae has six genera in this study:
Rubulavirus, Avulavirus, Ferlavirus, Henipavirus, Morbillivirus and Respirovirus. This
family contains a range of members from Sendai virus, which typically infects rats and
mice, to viruses that cause childhood epidemics such as Measles and Mumps.
Parainfluenza viruses and respiratory syncytial virus (RSV) cause respiratory infections,
while Measles, a Morbillivirus, and Mumps, a Rubulavirus, cause systemic infections. All of
the Paramyxoviruses are transmitted through the respiratory route, making them highly
contagious. The virions of this family consist of an envelope, a nucleocapsid, and
multiple copies of a matrix protein. Virions are spherical to pleomorphic and range
from 150 to 300 nm in diameter [29]. The envelope has spike-like projections spaced
widely apart that evenly cover the surface and are embedded in a lipid bilayer [30-32].
The nucleocapsid is 600-1000 nm long, depending on genus, and 13-18 nm in diameter, and has
helical symmetry [29]. The virions attach to the surface of a host cell, and the envelope
fuses with the plasma membrane, releasing the nucleocapsid into the cell. The
negative-sense RNA is transcribed into individual messenger RNAs and a positive-sense RNA
template, which is used to create new negative-sense RNA. Assembly occurs, and new
viruses bud from the cell membrane, incorporating host lipids into the envelope [30-32].
Filoviridae: Filoviridae consists of two major genera, Ebolavirus and
Marburgvirus, which cause hemorrhagic fevers that have mortality rates up to 90% in
humans. Ebolavirus is endemic to Africa and to the Philippines. In contrast to the three
other viral families, Ebolavirus has virions that are filamentous. Infectious Ebola virions
are usually 920 nm in length and 80 nm in diameter, and have a membrane acquired from the
host cell during budding [33]. The protein-coding genes in the genomes are N (major
nucleoprotein), VP35 (phosphoprotein), VP40 (matrix protein), GP (glycoprotein), VP30
(minor nucleoprotein), VP24 (secondary matrix protein), and L (RNA-dependent RNA
polymerase). These viruses have a transcriptional gradient from N to L consistent with
other viruses of the Order [34]. Filoviruses are estimated to have diverged less than
10,000 years ago, which coincides with the rise of agriculture in human history [35].
Amongst humans, Ebola is transmitted by contact with infected bodily fluids and/or
tissues. However, there is evidence of a possible respiratory route of transmission of
Ebola in nonhuman primates [36].
Rhabdoviridae: The Rhabdoviridae family contains Rabies and Vesicular
stomatitis viruses, both of which, like many Mononegavirales, are able to pass from their
animal hosts to cause disease in humans. Rhabdoviridae contains six genera:
Vesiculovirus, Lyssavirus, Ephemerovirus, Novirhabdovirus, Cytorhabdovirus, and
Nucleorhabdovirus. The virions are enveloped, bullet shaped and approximately 75 nm
wide and 180 nm long. The genome codes for the proteins in the order 3’-N, P, M, G and
L-5’ (Fig 1.1) [8]. Infection involves the attachment of the viral glycoproteins to the host
receptors, which results in clathrin-mediated endocytosis of the virion into the host cell.
The virion membrane then fuses with the vesicle membrane and the ribonucleocapsid is
released into the cytoplasm [8,37]. Transcription occurs in the cytoplasm, as in most other
members of the Order. When the virus has successfully replicated, the ribonucleocapsids
bind to the matrix proteins and bud via the endosomal sorting complex required for
transport (ESCRT) [38].
Vesicular Stomatitis Virus (VSV) - The Prototype of the Order
VSV has become the general prototype/model for the
Rhabdoviridae family, and its transcription and replication are a model for the entire
order of Mononegavirales [4,10]. VSV has also emerged as a tool for molecular biology
and immunology as a vector for the development of experimental vaccines for a host of
diseases, including HIV, as well as use in anti-tumor therapy [5,39]. Increased
understanding of the protein structures and interactions of the replication and
transcription complex of VSV would not only improve our understanding of this virus, but
could also further the therapeutic use of other members of Rhabdoviridae and even
Mononegavirales.
Pathology and Epidemiology: VSV is normally associated with livestock, and the
two serotypes most commonly documented in epidemics are VSV New
Jersey and VSV Indiana. The New Jersey serotype is responsible for many of the
epidemics within the United States [40]. The disease appears in livestock as vesiculation
and/or ulceration of the tongue, oral tissues, hooves or teats and results in substantial loss
of productivity and body mass. Symptoms are localized during infection with multiple
sites of infection being uncommon. In culture, cells display viral products that turn off
cellular gene expression and hijack the metabolic processes of the cell. Further, these
viral products depolymerize the cytoskeleton and are responsible for rapid tissue
destruction. In animals infected with VSV, the virus triggers interferon and nitric oxide
responses from the host that result in controlling viral replication. This immune response
also results in the production of antibodies that prevent further viral replication. The
antibody memory has been reported to last for over eight years post infection. Except for
its presentation in horses, VSV is indistinguishable from foot-and-mouth disease.
However, unlike foot-and-mouth disease, VSV can be very infectious for humans and can
cause debilitating symptoms during infection [39,41].
VSV outbreaks are seasonal in the southeastern USA, southern Mexico, Central
America and northern South America. Migration of the virus has been observed from
tropical areas causing sporadic epidemics in the temperate climates during the summer
months. VSV is arthropod-borne, carried by organisms such as the biting midge Culicoides
sonorensis [41]. VSV outbreaks are relatively random, which contributes to the virus’s
inability to become established in the US. Additionally, there is no long-term reservoir
for maintenance of the viral population, further preventing the virus from establishing a
foothold in the US [40].
Vesicular Stomatitis Virus Particle: VSV is an enveloped, non-segmented
negative single-stranded RNA virus. The virion has the characteristic bullet shape of
rhabdoviruses and is 70 nm in diameter and 180 nm long [42]. The VSV genome
consists of 11,161 nucleotides and is composed of five genes, which are the
nucleoprotein (N) 47 kDa, the phosphoprotein (P) 30 kDa, the matrix (M) 26 kDa, the
glycoprotein (G) 57 kDa, and the large subunit of the polymerase (L) 241 kDa [43,44].
The genome is organized 3’-leader-N-P-M-G-L-5’ and the prevalence of each gene
product is relative to its order, with N having the highest concentration of mRNA and L
the lowest (Fig 1.1) [45].
RdRp Complex: For all of Mononegavirales, the RdRp complex is composed of
the RNA genome and three proteins: N, P and L. The RNA genome of this complex is
always found associated with the N protein. The N:RNA coupling protects all of the
approximately 11,000-base RNA from ribonuclease digestion (Fig 1.2). The large size of
the L:P:N/RNA complexes of approximately 1200 N, 400 P, and 50 L proteins is beyond
the limits of current structure determination methods (i.e., X-ray crystallography or NMR
spectroscopy), which was a reason the bioinformatic approach was undertaken. For VSV
specifically, the entire ribonucleoprotein (RNP) complex contains approximately 1258
molecules of the N protein, each of which is bound to nine bases of RNA [42]. The large
polymerase subunit L and the phosphoprotein P are the two essential viral components in
the polymerase [46]. Viral transcription and replication by VSV are distinct processes
that are defined in part by the level of the N protein in the cell. The N protein is initially
in complex with the P protein preventing the concentration-dependent aggregation of N.
This keeps the N protein from encapsidating non-specific RNA transcripts during
replication [47]. An illustration of the model for RdRp subunit interactions during
transcription for VSV is shown in Figure 1.2.
Nucleoprotein (N): The nucleoprotein has been identified to have three basic
properties. The first is that it binds to the RNA genome to protect it from ribonucleases
[43]. The second is that it polymerizes to cover the entire length of the genome. The
third is that N requires association with P to encapsidate the RNA, preventing the
aggregation of the N proteins [48]. A crystal structure now exists of an isolated
90-nucleotide strand of RNA associated with 10 copies of the nucleoprotein of VSV [49].
Figure 1.2. Schematic of Vesicular stomatitis virus RNA Synthesis. The L (gray and
shaped like a number six) and P (green) proteins interact with the RNP (N in blue and the
RNA is represented by a black line) to transcribe the 5 individual mRNAs of the genome
and also replicate the genome by creating the positive sense RNA template.
The researchers observed that the RNA exists tightly bound in a cavity that
provides a hydrophobic space to accommodate the bases of the RNA. This cavity exists
at the interface between two lobes in the N protein with nine nucleotides associated with
each N molecule. The structure of the RNA-nucleoprotein complex also showed a
number of interactions between neighboring N protein molecules where each protein is in
contact with three neighboring N molecules [48]. A further study has shown that these
neighboring lobe interactions provide more stability than the positively charged residues
of the RNA binding cavity [50]. These discoveries have also added evidence that the
mechanism for RNA synthesis occurs as a portion of the N protein temporarily
dissociates from the RNA with the active polymerase complex. This evidence also
discredits the other two hypothetical models. First, N would prevent access to several
positions of the RNA, so no Watson-Crick base pairing could take place. Second, the RNP
remains intact after one round of RNA synthesis, dispelling the idea that the
nucleoprotein completely dissociates from the RNA.
Phosphoprotein (P): The phosphoprotein in the RdRp assists N in recognition
and encapsidation of the RNA genome, allowing L to specifically recognize the N-RNA
template and progress along it. Studies have shown P-deficient rabies viruses are unable
to replicate [51]. VSV P contains three domains: Domain I contains the N-terminus
(residues 1-137) and is responsible for influencing transcription and binding L; Domain II
(residues 211-244) is important to replication and binds to L’s C-terminus; Domain III is
the C-terminus (residues 245-265) and binds the N-RNA template in two positions
[47,52,53]. P has been shown to form a dimer with the central oligomerization domain
between residues 107-177. Sendai virus P has also been demonstrated to form
homotetrameric oligomers [54]. The P protein in VSV has nine identified
phosphorylation sites, and phosphorylation has been observed to be important for replication of the virus
[55]. Residue 179 has been linked to ATP utilization and is the switch that modulates
interaction between the N-RNA complex and the L polymerase for transcription and
replication [4,37]. P has three conserved domains and a hinge region [56]. Domain I has
been identified as the amino-terminal acidic domain and is phosphorylated by casein
kinase II at residues Ser 60, Thr 62, and Ser 64 [57]. Domain II, located at residues ~210-244, is phosphorylated at residues Ser 226 and Ser 227 by a kinase associated with the L
protein and is necessary for replication [58]. It has also been shown that the C-terminal
domain of the VSV P protein is important for its complex formation with both the N and
L proteins [56]. The L protein is also stabilized by interaction with the P protein [59].
Large Subunit Polymerase (L): The L protein has been largely characterized by
studies of VSV and Sendai virus (SENV). L is the largest of the VSV genes and is
the catalytic component of the RdRp. The L gene of VSV is approximately 6.3 kb
in length and encodes a protein of 2109 amino acids [44]. There are six conserved
domains in L, and they are shared among all L proteins of the order [60]. Domain I in
SENV has been shown to interact with P and the P-N0 complex during
encapsidation of nascent RNA during replication. Domain II contains conserved
charged motifs that play a role in template binding in SENV [61]. Domain III contains
the RdRp activity, and it is also required for polyadenylation, which occurs through
polymerase slippage on a template U tract [62]. The capping activities of the
Mononegavirales L polymerase located in Domain V are different from other viruses and
their hosts. Specifically, an RNA:GDP polyribonucleotidyltransferase activity present
within Domain V transfers 5′ monophosphate RNA onto a GDP acceptor through a
covalent L–pRNA intermediate. The resulting mRNA cap is subsequently modified by a
dual specificity methyltransferase activity within Domain VI where ribose 2′-O
methylation precedes and facilitates subsequent guanine-N-7 (G-N-7) methylation [63].
The region between 1638-1673 in Domain VI has been shown to be involved in binding
the phosphoprotein through a deletion mutant of this region that failed to bind P [63].
These domains influence each other functionally, as failure to cap the nascent RNA chain
results in the premature termination of transcription, and blocking methylation results in
hyperpolyadenylation. These latter observations demonstrate that the 5′ mRNA
processing activities of L intimately regulate its nucleotide polymerization activity and
suggest that the 3D arrangement of the functional domains likely serves a key regulatory
role during RNA synthesis [64].
Currently, there are no crystal or NMR structural datasets available for the entire
L or any region of L. However, studies using negative stain electron microscopy (EM)
have obtained a molecular view of L alone, and in complex with the viral P cofactor. EM
analysis, combined with proteolytic digestion and deletion mapping, revealed the
organization of L into a ring domain containing the RNA polymerase and an appendage
of three globular domains containing the cap-forming activities (Fig 1.2) [64]. The
capping enzyme maps to a globular domain, which is juxtaposed to the ring, and the cap
methyltransferase maps to a more distal and flexibly connected globule. Upon P binding,
L undergoes a significant rearrangement that may reflect an optimal positioning of its
functional domains for transcription. The structural map of L provides new insights into
the interrelationship of its various domains, and their rearrangement on P binding that is
likely important for RNA synthesis. Because the arrangement of conserved regions
involved in catalysis is homologous, the structural insights obtained for VSV L likely
extend to all non-segmented negative-strand (NNS) RNA viruses [64].
Methods
Multiple Sequence Alignments: The multiple sequence alignments for each
family were created by submitting the sequences to the MAFFT ver. 6 server
(http://mafft.cbrc.jp/alignment/server/index.html) using the E-INS-i strategy, which uses a
generalized affine gap cost and is applicable to difficult problems such as RNA
polymerase. Each family alignment was manually curated to ensure optimal alignments.
For the alignment of the entire order, each independent family alignment was organized
into one FASTA file and submitted to the MAFFT ver. 6 alignment server using the E-INS-i strategy [26]. The multiple sequence alignment (MSA) output was then manually
curated due to the wide divergence of the sequences. These alignments were then
uploaded into DisICC to create the corresponding alignment, sequence, and amino acid
objects needed for analysis in the pipeline.
Phylogenetic Trees: The family and order alignments for the N, P and L protein
sequences were the input for MrBayes3.1 [29, 30] and BEASTv1.5.4 & 1.7.2 [60] for the
generation of the phylogenetic trees (only the alignments for the N sequences were run
through MrBayes). The parameters used for MrBayes3.1 were a mixed amino acid
model, an eight-category gamma distribution of rates, and 1,000,000 generations of the Markov
Chain Monte Carlo analysis. In our studies, our knowledge of the family classifications of
the sequences was used to design four constraints. It should be noted that although the
constraint parameter was invoked for the trees, MrBayes3.1 overrides any constraint if
the data do not support it. It has previously been shown that MrBayes3.1, with
appropriate constraints, produces trees with higher confidence at each node than other
tree methods: neighbor-joining, minimum evolution, maximum parsimony, and the unweighted pair group method with arithmetic mean [61]. For each of the protein
comparisons, BDV was the outgroup due to its significant divergence from the other
families.
The BEASTv1.5.4 and 1.7.2 trees were created using two independent Bayesian
MCMC chains (10 million steps, 10% burn-in) run under the WAG amino acid
substitution model [62] and rate heterogeneity among sites (four category gamma
distribution rate). Monophyletic taxon sets consisting of Filoviridae, Rhabdoviridae and
Paramyxoviridae were also used in the models.
Disorder and Consensus Prediction: Over the last decade, the dogma that proteins
require discrete structure to be functional has been systematically changed by evidence
that unstructured protein regions are just as important as those with well-defined tertiary
structure in their native state [65,66]. These proteins are classified as intrinsically
unstructured proteins (IUPs) or disordered. Unstructured proteins can range from
fully disordered proteins to generally folded proteins with both long and short disordered
sections. This disorder is associated with a number of functions, including cell-cycle
regulation, signal transduction, and transcription. Additionally, these disordered regions
permit functional flexibility to interact with multiple binding partners. The functionality
of these IUPs is often triggered by the binding of a partner or target ligand. This docking
induces formation of secondary structure, and the disorder confers fast interaction and
specificity (or multiple recognitions) without excess binding strength. It has also been
suggested that disordered proteins provide a simple solution to having large
intermolecular interfaces while keeping smaller protein, genome, and cell sizes [67].
The result of each of the protein disorder prediction methods was evaluated for each
amino acid position and then converted to a 0 or 1, corresponding to “not disordered” or
“disordered”, respectively. This step is necessary as the threshold for disorder differs for
DisEMBL based on the sequence itself and does not conform to the 0.5 threshold of
disorder of the other methods. Therefore, each result was normalized based on the
method’s reported threshold. For each amino acid, a consensus value was calculated by
averaging the set of scores, from each disorder calculation method, at that location; the
resulting value is stored as the disorder consensus value in the corresponding amino acid
object within DisICC.
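The normalization and averaging described above can be sketched as follows. This is a minimal illustration and not the DisICC implementation: the function names, the method labels and the example scores are hypothetical, and treating a score equal to the threshold as disordered is an assumption.

```python
from typing import Dict, List


def binarize(scores: List[float], threshold: float) -> List[int]:
    """Map raw per-residue scores to 1 (disordered) or 0 (not disordered)."""
    # Assumption: a score equal to the threshold counts as disordered.
    return [1 if s >= threshold else 0 for s in scores]


def disorder_consensus(raw_scores: Dict[str, List[float]],
                       thresholds: Dict[str, float]) -> List[float]:
    """Average the binary calls of all methods at each residue position."""
    calls = [binarize(raw_scores[m], thresholds[m]) for m in raw_scores]
    if len({len(c) for c in calls}) != 1:
        raise ValueError("all methods must score the same number of residues")
    return [sum(column) / len(calls) for column in zip(*calls)]


# Illustrative values only; these are not results from this work.
raw = {"iupred_long": [0.2, 0.6, 0.9], "disembl_hotloops": [0.1, 0.5, 0.3]}
cutoffs = {"iupred_long": 0.5, "disembl_hotloops": 0.4}
consensus = disorder_consensus(raw, cutoffs)
```

A consensus of 1.0 at a position means every method called the residue disordered, and 0.0 means none did.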
IUPred: This disorder prediction method is independent of presumed structure
as it relies only on pairwise energy calculations. These energy calculations are based on
an amino acid energy predictor matrix. The energy and amino acid composition for each
position is calculated by considering interaction partners 2-100 residues away. The
position-specific estimates of energy are averaged over a window of 21 residues and
reported as the final result [68]. IUPred predictions were run for both long and short
disorder settings.
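The final averaging step can be illustrated with a simple centred sliding window. This sketch covers only the window averaging, not the IUPred energy estimation itself; the function name is hypothetical, and truncating the window at the sequence ends is an assumption about edge handling.

```python
from typing import List


def window_average(values: List[float], window: int = 21) -> List[float]:
    """Average each position with its neighbours inside a centred window."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        # Assumption: the window is truncated at the sequence ends.
        segment = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed
```

With the default of 21, each interior residue is averaged with the ten residues on either side.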
Regional Order Neural Network (RONN): This disorder predictor uses a neural
network trained on sequences of known folding states (ordered, disordered, or a mixture of
both). However, unlike other disorder methods that employ neural networks, RONN [69]
focuses on individual amino acids: rather than representing the sequences in a feature
space such as hydrophobicity and charge, or according to known properties, RONN uses
‘distances’ (determined by sequence alignment) from a subset of well-characterized
prototype sequences. These distances are calculated and the training of the neural
network is performed in this ‘distance’ space. This is called the bio-basis function neural
network (BBFNN) method [69,70]. Since the length of disordered/ordered regions
varies, the BBFNN uses the concept of non-gapped homology alignment to maximize the
alignment score between pairs of sequences [71]. This allows the prototype sequences to
have different lengths, although they must be at least as long as a pre-defined window
size, and sub-sequences for a query sequence (of this window size and centered on each
residue in turn) are then aligned to all the prototypes. The resulting homology scores are
used for statistical pattern recognition to give a probability of disorder for each query
sequence window, and these scores are averaged to give a probability of disorder for each
residue in the query sequence.
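Two steps of this description lend themselves to a short sketch: the non-gapped alignment of a query window against a longer prototype, and the averaging of window probabilities into per-residue probabilities. The code below is an illustration under stated assumptions and not the RONN implementation: the function names are hypothetical, the identity score is a toy stand-in for a substitution matrix, and windows are indexed by their start position rather than their centre.

```python
from typing import Callable, List


def ungapped_score(window: str, prototype: str,
                   score: Callable[[str, str], float]) -> float:
    """Best score of `window` slid along `prototype` without gaps."""
    if len(prototype) < len(window):
        raise ValueError("prototype must be at least as long as the window")
    best = float("-inf")
    for offset in range(len(prototype) - len(window) + 1):
        # zip stops at the end of the window, so only aligned residues count.
        total = sum(score(a, b) for a, b in zip(window, prototype[offset:]))
        best = max(best, total)
    return best


def per_residue_probability(window_probs: List[float],
                            window_size: int) -> List[float]:
    """Average the probabilities of every window that covers each residue."""
    # Assumption: window_probs[i] belongs to the window starting at residue i.
    n_residues = len(window_probs) + window_size - 1
    totals = [0.0] * n_residues
    counts = [0] * n_residues
    for start, prob in enumerate(window_probs):
        for pos in range(start, start + window_size):
            totals[pos] += prob
            counts[pos] += 1
    return [t / c for t, c in zip(totals, counts)]


def identity(a: str, b: str) -> float:
    """Toy stand-in for a substitution matrix: 1 for a match, else 0."""
    return 1.0 if a == b else 0.0
```

Residues near the ends of the sequence are covered by fewer windows, so their averages rest on fewer scores.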
DisEMBL: DisEMBL is a disorder prediction method based on artificial neural
networks trained for predicting several definitions of disorder [72]. The Disorder, Intra-residue contact and Compensatory mutation Correlator (DisICC) uses two of the disorder
definitions from DisEMBL (loops/coils and hot loops). Loops/coils are defined from DSSP
[73] assignments: α-helix, 3₁₀-helix and β-strand are considered ordered states, and all other states are considered loops/coils. Loops/coils are not necessarily disordered;
however, protein disorder is only found within loops. Loop assignments can therefore be used as a
necessary but not sufficient requirement for disorder. Hot loops are a refined
subset of loops/coils: those loops with a high degree of mobility as determined from C-α
temperature factors (B-factors). These dynamic loops are considered protein disorder.
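The two definitions can be expressed as masks over a DSSP state string. The sketch below assumes the standard one-letter DSSP codes (H for α-helix, G for 3₁₀-helix, E for β-strand); the function names and the B-factor cutoff are hypothetical and are not taken from DisEMBL, which applies its own criterion for high mobility.

```python
from typing import List

# DSSP codes treated as ordered: alpha-helix, 3-10 helix and beta-strand.
ORDERED_STATES = {"H", "G", "E"}


def loop_mask(dssp: str) -> List[bool]:
    """True where the DSSP state is a loop/coil (anything not ordered)."""
    return [state not in ORDERED_STATES for state in dssp]


def hot_loop_mask(dssp: str, b_factors: List[float],
                  cutoff: float) -> List[bool]:
    """True where a residue is in a loop and its C-alpha B-factor is high."""
    if len(dssp) != len(b_factors):
        raise ValueError("one B-factor is required per residue")
    # Assumption: 'high mobility' is a simple B-factor cutoff chosen by the user.
    return [is_loop and b > cutoff
            for is_loop, b in zip(loop_mask(dssp), b_factors)]
```

Every hot loop is a loop by construction, which mirrors the statement that hot loops are a subset of loops/coils.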
PONDR: PONDR functions from primary sequence data alone. The predictors
are feed-forward neural networks that use sequence information from windows of 21
amino acids. Attributes, such as the fractional composition of particular amino acids or
hydropathy, are calculated over this window, and these values are used as inputs for the
predictor. The neural network, which has been trained on a specific set of ordered and
disordered sequences, then outputs a value for the central amino acid in the window. The
predictions are then smoothed over a sliding window of 9 amino acids. If a residue value
exceeds a threshold of 0.5 (the threshold used for training) the residue is considered
disordered.
The nucleoprotein predictions were run using the VL-XT predictor. VL-XT
integrates three feed-forward neural networks: the VL1 predictor [74] and the N- and C-terminal predictors (XT) [75].
PONDR Fit: This meta-predictor makes use of the PONDR disorder prediction
algorithms that use neural networks trained on different disorder sets. Additional
disorder methods are combined with the PONDR results into a meta-prediction of protein
disorder. This method was chosen to replace the standard PONDR method as it is freely
available via a web service.
CORNET: This method is a neural network based predictor that uses as input
correlated mutations, sequence conservation, predicted secondary structure and
evolutionary information [76-79]. This predictor uses a feed-forward neural network
trained with a standard back-propagation algorithm [80] to associate protein single
sequences to their corresponding contact maps from a database of contacts. Five different
networks of increasing input complexity are used to train the network, including ordered
couples of residues, evaluation of hydrophobic residue neighborhood, conservation
weight of alignments for evolutionary information, and additional hydrophobic residue
information for a three-residue window. After training, sequences are entered into the
neural network, and the resulting intra-protein contact positions are output in CASP
format.
ConSEQ: This method of intra-protein residue contact prediction requires a
multiple sequence alignment (MSA) as input in FASTA format. From the MSA a
neighbor-joining tree is generated and conservation scores for each site are calculated
using empirical Bayesian scoring [81]. The conservation scores are a relative measure of
evolutionary conservation at each sequence site of the query sequence.
Xdet: This method is intended to locate positions in an MSA that are related to
the functional classification of the proteins, ideally when the functional classes can be
related by a hierarchy, or distances between them can be defined. The theory is as
follows: at a particular amino acid location within an MSA, a dramatic amino acid
change between two proteins would be correlated with a high functional difference
between these proteins. Likewise, similar amino acids imply functional similarities. For
each position in the alignment, a matrix quantifying the amino acid changes for all pairs
of proteins is constructed based on a substitution matrix. In this matrix, a given entry
represents the similarity between the residues of two proteins at that position. An
equivalent matrix is constructed from an external explicit functional classification where
each entry represents the ‘functional similarity’ between the corresponding proteins (for
the functional feature we are interested in). These two matrices are compared with a
Spearman rank-order correlation coefficient. So, for a multiple alignment of proteins, the
similarities between the amino acids of different proteins at the same position in the
alignment are calculated. Positions with >10% gaps are excluded from the calculations.
To prevent bias and under-sampling, a constraint was applied requiring all MSAs to have
10 or more sequences having greater than 19 percent identity and less than 90 percent
identity.
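The per-position comparison of the two matrices can be sketched as below. This is an illustration of the idea rather than the Xdet code: all function names are hypothetical, the residue similarity function stands in for a substitution matrix, returning 0.0 when one vector is constant is an assumption, and the Spearman coefficient is computed as a Pearson correlation of tie-averaged ranks.

```python
from itertools import combinations
from math import sqrt
from typing import Callable, List, Optional, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return 0.0 if vx == 0 or vy == 0 else cov / sqrt(vx * vy)


def position_score(column: str,
                   functional_similarity: List[List[float]],
                   residue_similarity: Callable[[str, str], float],
                   max_gap_fraction: float = 0.10) -> Optional[float]:
    """Spearman correlation of residue and functional similarity over pairs."""
    if column.count("-") / len(column) > max_gap_fraction:
        return None  # positions with >10% gaps are excluded
    pairs = list(combinations(range(len(column)), 2))
    residue = [residue_similarity(column[i], column[j]) for i, j in pairs]
    function = [functional_similarity[i][j] for i, j in pairs]
    return pearson(average_ranks(residue), average_ranks(function))


def same_residue(a: str, b: str) -> float:
    """Toy similarity: 1 for identical residues, else 0."""
    return 1.0 if a == b else 0.0
```

A column whose residues split exactly along the functional classes scores close to 1, while a column that varies independently of the classes scores near or below 0.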
Coevolution Analysis using Protein Sequences (CAPS): The CAPS method identifies co-evolving amino acid
site pairs by measuring the correlated evolutionary variation at these sites. Evolutionary
variation is measured using time-corrected Blosum values for the transition between two
amino acids at a particular site from the MSA output. The transition between two amino
acids at each site is corrected by the divergence time of the sequences. The time is
estimated as the mean number of substitutions per synonymous site between the two
sequences being compared [82]. Correlation of the mean variability is measured using the
Pearson coefficient. Finally, the significance of the correlation coefficients is estimated
by comparing the real correlation coefficients to the distribution of re-sampled correlation
coefficients. Only the parsimony information of co-evolving sites is considered. Further, a step-down permutational procedure is applied to correct for multiple testing and non-independence of data [83]. CAPS also performs a preliminary analysis of compensatory
mutations by testing the correlation for hydrophobicity as well as in the molecular weight
variations between co-evolving amino acids. These calculations are performed on both
intra- and inter-protein alignments [82]. For inter-protein calculations, two MSAs with
identical organizations or corresponding protein partners are required, with the protein of
interest appearing first in the MSA. As with Xdet, an additional constraint was applied: all
MSAs must have 10 or more sequences with greater than 19 percent identity and less than
90 percent identity.
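The correlation and re-sampling steps can be sketched as follows. This is a simplified illustration and not the CAPS implementation: it takes already time-corrected transition values for each sequence pair as input, measures variability as the squared deviation from the site mean (an assumption), and uses a plain permutation test in place of the step-down procedure. All names are hypothetical.

```python
import random
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return 0.0 if vx == 0 or vy == 0 else cov / sqrt(vx * vy)


def variability(theta: Sequence[float]) -> List[float]:
    """Squared deviation of each pairwise transition value from the site mean."""
    mean = sum(theta) / len(theta)
    return [(t - mean) ** 2 for t in theta]


def coevolution_r(theta_a: Sequence[float], theta_b: Sequence[float]) -> float:
    """Correlation of variability between two sites across sequence pairs."""
    return pearson(variability(theta_a), variability(theta_b))


def permutation_p_value(theta_a: Sequence[float], theta_b: Sequence[float],
                        n_resamples: int = 1000, seed: int = 0) -> float:
    """Fraction of shuffled correlations at least as strong as the observed one."""
    observed = abs(coevolution_r(theta_a, theta_b))
    rng = random.Random(seed)
    shuffled = list(theta_b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)
        if abs(coevolution_r(theta_a, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)
```

Here `theta_a` and `theta_b` each hold one time-corrected value per pair of sequences, in the same pair order for both sites.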
Identification of Co-evolution/Intra-Residue Contact Predictions (CICPs): CICPs were generated using results from
CORNET, ConSEQ, Xdet and CAPS. Each amino acid in a sequence was evaluated
for positive results from the four methods listed. If an amino acid was found to have
three or more positive results, it was classified as a CICP within its object. To evaluate
the conservation of CICPs among the families and the order, at a given alignment
position the CICP values of all sequences with a valid CICP calculation were averaged to create a consensus
CICP score.
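The classification rule and the consensus score can be sketched as below. The function names and data layout are hypothetical and do not reflect the DisICC object model; `None` is used here to mark a sequence without a valid CICP calculation at a position.

```python
from typing import Dict, List, Optional

METHODS = ("CORNET", "ConSEQ", "Xdet", "CAPS")


def is_cicp(positive: Dict[str, bool], minimum: int = 3) -> bool:
    """A residue is a CICP if at least `minimum` of the four methods are positive."""
    return sum(1 for m in METHODS if positive.get(m, False)) >= minimum


def consensus_cicp(column: List[Optional[bool]]) -> Optional[float]:
    """Average CICP status over sequences with a valid call at one position."""
    # Assumption: None marks a sequence with no valid CICP calculation.
    valid = [1.0 if flag else 0.0 for flag in column if flag is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)
```

For example, a residue positive in CORNET, Xdet and CAPS but not ConSEQ is classified as a CICP, and an alignment column with calls `[True, False, None, True]` has a consensus of 2/3.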
A BIOINFORMATICS APPROACH TO THE STRUCTURE, FUNCTION, AND
EVOLUTION OF THE NUCLEOPROTEIN OF THE ORDER
MONONEGAVIRALES
Contribution of Authors and Co-Authors
Manuscript in Chapter 2
Author: Sean B. Cleveland
Contributions: Conceived and designed the experiments, performed the experiments,
analyzed the data and wrote the paper.
Co-Author: John S. Davies
Contributions: Performed the experiments and wrote the paper.
Co-Author: Marcella A. McClure
Contributions: Conceived and designed the experiments, analyzed the data and wrote the
paper.
Manuscript Information Page
Sean B. Cleveland, John Davies, and Marcella A. McClure
PLOS One
Status of Manuscript: (Put an x in one of the options below)
____ Prepared for submission to a peer-reviewed journal
____ Officially submitted to a peer-review journal
____ Accepted by a peer-reviewed journal
__X_ Published in a peer-reviewed journal
PLOS One
Submitted: September 3, 2010
Published: April 1, 2011
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0019275
Abstract
The goal of this bioinformatic study is to investigate sequence conservation in
relation to evolutionary function/structure of the nucleoprotein of the order
Mononegavirales. In the combined analysis of 63 representative nucleoprotein (N)
sequences from four viral families (Bornaviridae, Filoviridae, Rhabdoviridae, and
Paramyxoviridae) we predict the regions of protein disorder, intra-residue contact and co-evolving residues. Correlations between location and conservation of predicted regions
illustrate a strong division between families while highlighting conservation within
individual families. These results suggest the conserved regions among the
nucleoproteins, specifically within Rhabdoviridae and Paramyxoviridae, but also
generally among all members of the order, reflect an evolutionary advantage in
maintaining these sites for the viral nucleoprotein as part of the transcription/replication
machinery. Results indicate conservation for disorder in the C-terminus region of the
representative proteins that is important for interacting with the phosphoprotein and the
large subunit polymerase during transcription and replication.
Additionally, the C-terminus region of the protein preceding the disordered region is predicted to be
important for interacting with the encapsidated genome. Portions of the N-terminus are
responsible for N:N stability and interactions identified by the presence or lack of co-evolving intra-protein contact predictions. The validation of these prediction results by
current structural information illustrates the benefits of the Disorder, Intra-residue contact
and Compensatory mutation Correlator (DisICC) pipeline as a method for quickly
characterizing proteins and providing the most likely residues and regions necessary to
target for disruption in viruses that have little structural information available.
Introduction
The Centers for Disease Control and Prevention have included the Ebola and
Marburg viruses, both negative-strand RNA viruses belonging to the order
Mononegavirales, in their list of Bioterrorism Agents/Diseases; however, structural
knowledge of these agents is limited. Mononegavirales is composed of four viral
families: Bornaviridae contains the Borna Disease Virus (BDV), which affects the
nervous system and the brain in many animals, including cows and rats, and endogenous
borna-like nucleoprotein element sequences exist within the human genome [1].
Paramyxoviridae includes Sendai Virus (SENV), which typically affects rats and mice,
and two viruses that cause childhood epidemics, Measles Virus (MeV) and Mumps Virus
(MuV). Filoviridae has only two members, Ebolavirus and Marburgvirus, which cause
hemorrhagic fevers with mortality rates up to 90% in humans [2, 3]. The Rhabdoviridae
contains Rabies Virus (RABV) and Vesicular Stomatitis viruses, which are both able to
pass from their animal hosts to cause disease in humans, as do many Mononegavirales.
Vesicular Stomatitis virus (VSV) is the model for the Rhabdoviridae family, and the
prototype for most of the investigation of transcription and replication for the entire order
of Mononegavirales[4]. VSV and Rabies are also used in therapies for cancer and
experimental vaccines against Human Immunodeficiency Virus and influenza[5-7].
Negative-strand RNA viruses are unique in that their RNA genomes are always
encapsidated by a viral coded nucleoprotein to form a ribonucleoprotein (RNP) complex.
This complex serves as the template for viral RNA synthesis and forms the structural core
of the viruses when packaged into virions[8]. The RNP is formed concurrently with
transcription/replication by the viral RNA-dependent RNA polymerase (RdRp). For all of
Mononegavirales, the RdRp complex is composed of the negative-sense RNA genome
and three proteins: nucleoprotein (N) (reviewed in Longhi 2009), phosphoprotein (P) and the
large subunit polymerase protein (L). The RNA genome of this complex is always found
associated with the nucleoprotein as the RNP. This structure is resistant to nucleases,
even during synthesis [9, 10]. The nucleoprotein is not only important for the encapsidation
of the RNA for transcription, but has also been identified in interactions with itself, the L
polymerase and the phosphoprotein for the generation of mRNAs in protein expression [11].
The nucleoprotein plays a critical role by polymerizing to cover the entire length of the
genome, thereby protecting it from ribonuclease digestion [12]. For this encapsidation, the nucleoprotein
requires association with the phosphoprotein, which chaperones it to the RNA and prevents the
concentration-dependent aggregation of nucleoproteins with each other. This association
also keeps the N protein from encapsidating non-specific RNA transcripts during
replication [13-15]. The nucleoproteins of bovine and human Respiratory Syncytial Virus (RSV) are able to
form nucleocapsid-like structures in the absence of RNA and the other viral proteins [16,
17]. Crystal structure evidence now exists for the nucleoproteins of VSV, RABV, BDV
and RSV. The VSV crystal was isolated with a 90-nucleotide strand of RNA associated with 10 copies of the nucleoprotein forming a
truncated RNP in the shape of a cylinder/ring [18]. The RNA was shown to exist tightly
bound in a cavity that provides a hydrophobic space to accommodate the bases of the
RNA. In RSV this cavity exists within a groove at the N-N interface with seven
nucleotides associated with each nucleoprotein subunit [19]. The structure of the VSV
RNA-nucleoprotein complex also shows a number of interactions between neighboring
nucleoproteins; each one is in contact with three neighboring N molecules forming a
tetramer [20]. A comparison of the structures of the nucleoproteins of BDV, RABV and
influenza A virus shows that the topology of the RNA binding region from the three
nucleoproteins is very similar and highlights common structural domains. The
nucleoproteins each contained at least five conserved helices in the N-terminal domain
and three in the C-terminal domain [21].
The current proposed mechanism for VSV RNA synthesis suggests that a portion
of the nucleoprotein temporarily dissociates from the RNA allowing the polymerase
access to the genome. This is supported by the crystal structure of the nucleoprotein from
VSV that shows the neighboring lobe interactions provide more stability than the
positively charged residues of the RNA binding cavity[22]. This work also provides
evidence that structurally N would prevent access to several positions of the RNA, so no
Watson-Crick base pairing could take place, and the RNP remains intact after one round
of RNA synthesis, dispelling the idea that the nucleoprotein completely dissociates from
the RNA during replication/transcription. Additionally, a model of RSV RNA synthesis,
based on nucleocapsid-like helical assemblies, suggests that the polymerase can induce
hinge movement of the N-terminal domain to the C-terminal domain. This hinge
movement would result in a transient opening of the groove allowing RNA access [19].
Bioinformatic methods were used to produce models of
the individual intra-protein contacts and disorder for the nucleoprotein in the study
presented here. The results of protein disorder prediction, correlated mutations, sequence
conservation, and intra-residue prediction methods have been correlated to characterize
the nucleoproteins based on the data these approaches generate from the protein sequence
information. The purpose of evaluating the regions of disorder within a protein is that
such areas are observed to be binding sites for protein-ligand interactions. Upon
association with the partner ligand, the protein assumes a secondary structure as observed
using X-ray crystallography [23, 24]. The flexibility that disorder imparts allows these
proteins to have multiple binding partners as well as multiple functions based upon
conformation. Since the nucleoprotein interacts with the RNA genome, phosphoprotein
and polymerase, it is likely that these regions of interaction contain disordered residues that
disorder prediction methods will highlight. The application of correlated mutation and
intra-protein contact predictors assumes that evolutionary functional constraints are
expected to limit the amino acid substitution rates, resulting in a higher conservation of
structural/functional sites with respect to the rest of the protein. Once a residue is
changed, given the constraints operating on it, this mutation can be compensated with an
additional mutation of a corresponding residue elsewhere in the protein that may be in
close proximity when folded to maintain the interaction. This enables the co-evolution of
the two residues that can lead to both high specificity and affinity. These assumptions can
be expanded to include inter-protein residue pairs as well as protein–nucleic acid
interactions[25-27]. The knowledge of these important residues aids in modeling protein
structures when combined with additional information derived from the disorder
prediction and sequence conservation. The resulting predictions provide sites that can be
pursued for point mutations and inhibition within the nucleoprotein to interfere with viral
transcription/replication.
Results
Phylogenetic Analysis
To explore the evolutionary relationships of the nucleoprotein within the viral
families and across the entire order, a phylogenetic reconstruction was performed. The
multiple alignment of all 63 N sequences was generated by manual curation of a MAFFT
alignment[28] that was then used as the input for MrBayes3.1[29, 30]. The results of a
MrBayes3.1 tree (results not shown) grouped BDV with the Filoviruses, which was
different from the most recent tree created using portions of the polymerase [31]. To
increase the confidence in this placement, a BEASTv1.5.4 analysis was performed
and confirmed the overall MrBayes results. This tree was rooted at the midpoint and
reveals three major clades (Fig 2.1). Clade I is BDV and Filoviridae, Clade II contains
Paramyxoviridae and Clade III is Rhabdoviridae; all clades show posterior probabilities
(PP) of 1.
Figure 2.1. Phylogenetic reconstruction of 63 nucleoprotein sequences of the order
Mononegavirales. The BEASTv1.5.4 tree was created using two independent Bayesian
MCMC chains (10 million steps, 20% burn-in) run under the WAG amino acid
substitution model[62] and rate heterogeneity among sites (gamma distribution with 4
categories). Monophyletic taxon sets consisting of Filoviridae, Rhabdoviridae and
Paramyxoviridae were also used in the model. The posterior probabilities label each
node and branch lengths are scaled to expected substitutions per site. Clade I consists of
BDV and Filoviridae, Clade II contains Paramyxoviridae and Clade III is Rhabdoviridae.
Brackets indicate virus families: Bornaviridae, green, Filoviridae, orange,
Paramyxoviridae, blue and Rhabdoviridae, red. Unassigned viruses are denoted by stars
colored by the family they are unassigned in.
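The posterior probability attached to a node can be read as the fraction of post-burn-in sampled trees that contain that clade. The following is a minimal Python sketch of that idea only. It assumes each sampled tree has already been reduced to a set of clades, each clade being a frozenset of taxon labels. The function name, the tree representation and the toy samples are illustrative assumptions and are not part of BEAST or MrBayes.

```python
def clade_posterior(sampled_trees, clade, burn_in=0.2):
    """Fraction of post-burn-in trees that contain `clade`.

    sampled_trees: list of trees, each a set of frozensets of taxon labels
                   (hypothetical representation chosen for this sketch).
    burn_in:       fraction of the chain discarded from the start.
    """
    start = int(len(sampled_trees) * burn_in)
    kept = sampled_trees[start:]
    if not kept:
        raise ValueError("no trees left after burn-in")
    target = frozenset(clade)
    return sum(target in tree for tree in kept) / len(kept)


# Toy chain of 10 samples; the first 2 are discarded with a 20% burn-in.
with_clade = {frozenset({"BDV", "MARV"}), frozenset({"RABV", "VSIV"})}
without_clade = {frozenset({"BDV", "RABV"}), frozenset({"MARV", "VSIV"})}
samples = [without_clade] * 3 + [with_clade] * 7
pp = clade_posterior(samples, {"BDV", "MARV"})
```

In this toy chain seven of the eight retained samples contain the clade, so the estimate is 7/8.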
Examination of Clade I reveals that BDV clades with Filoviridae at a PP of 0.98. The
Filoviruses group with each other and Lake Victoria Marburgvirus (MARV) branches
from the Ebolaviruses at a PP of 1.
Clade II shows Paramyxoviridae branching into the subfamilies Paramyxovirinae
and Pneumovirinae (Fig 2.1). Within the subfamily Pneumovirinae all genera group
with PPs of 0.95-1.0. Bovine Respiratory Syncytial Virus (BRSV) sits outside the human
viruses with a PP of 1. The Paramyxovirinae subfamily branches into two subclades.
The first contains the Rubulaviruses and Avulaviruses with the unclassified Tioman Virus
(TIOV). The Rubulavirus and Avulavirus relationships are highly supported by PPs of
1 throughout their topology. TIOV groups within the Rubulaviruses. The second is made
up of the Respiroviruses, Henipaviruses, Morbilliviruses and the five unclassified viruses:
Fer-de-lance Virus (FDLV), Tupaia Virus (TUPV), Mossman Virus (MOSV), Beilong Virus
(BEIV), and J Virus (JV) with a PP of 1. FDLV is an outgroup to the Henipaviruses and
Morbilliviruses at a PP of 0.81. Both MOSV and TUPV group with the Henipaviruses with
PPs of 0.86. With a low PP of 0.53, BEIV and JV form their own
group outside the Morbilliviruses. The Morbilliviruses and Respiroviruses resolve
relationships with PPs from 0.8-1.0.
Examination of the Rhabdoviridae in Clade III reveals high PPs across all genera.
Within Clade III there are two subclades. The first subclade is composed of the
Ephemeroviruses, Vesiculoviruses and Lyssaviruses. The currently unassigned Flanders
Virus (FLAV) branches with Bovine Ephemeral Fever Virus (BEFV) with a PP of 1,
suggesting it belongs to the Ephemeroviruses. Siniperca Chuatsi Rhabdovirus (SCRV)
groups between the Ephemeroviruses and the other Vesiculoviruses with a PP of 0.99.
Lyssaviruses are an outgroup to the Ephemeroviruses and Vesiculoviruses with a PP of
1.0. The second subclade contains the Cytorhabdoviruses, Nucleorhabdoviruses and the
Novirhabdoviruses. The Novirhabdoviruses are an outgroup to the plant viruses
Cytorhabdoviruses and Nucleorhabdoviruses at a PP of 0.96.
Disorder Prediction
To identify potential residues that could be involved in inter-protein binding,
protein disorder prediction programs were applied to the nucleoprotein sequences and
combined into a consensus prediction. The results of the four disorder prediction
programs (PONDR[32-34], IUPred[35, 36], DisEMBL[37], and Disopred[38]) were
normalized and averaged for each amino acid residue of the nucleoprotein sequences into
a consensus prediction value. Those values were mapped onto the Multiple Sequence
Alignments (MSAs) of each of the four viral families’ nucleoproteins to observe whether
there is any pattern in the location of disordered regions (Fig 2.2). The Bornaviridae
sequence displays four regions of disorder with the largest being in the N- and C-termini
(Fig 2.2A, Table S1A). Filoviridae sequences contain four distinct regions of disorder with
the largest being in the C-terminus. These sequences also contain the largest region of
disorder of the entire order, averaging over 200 consecutive residues in length beginning
just downstream from residue 400 in the MSA (Fig 2.2B, Table S1B).
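The normalize-and-average step described above can be sketched in Python as follows. The text does not state which normalization was used, so min-max scaling of each predictor to the range 0 to 1 is an assumption made here for illustration. The function names and the raw scores are hypothetical toy values, not output from the four programs.

```python
def min_max(scores):
    """Rescale one predictor's per-residue scores to [0, 1] (assumed scheme)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def consensus_disorder(predictions):
    """Average the normalized scores of all predictors at each residue.

    predictions: dict mapping predictor name -> list of raw per-residue scores.
    """
    lengths = {len(v) for v in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all predictors must score the same residues")
    normalized = [min_max(v) for v in predictions.values()]
    return [sum(column) / len(column) for column in zip(*normalized)]


# Toy three-residue example; the numbers are invented for illustration.
raw = {
    "PONDR": [0.1, 0.5, 0.9],
    "IUPred": [0.2, 0.4, 0.8],
    "DisEMBL": [10.0, 20.0, 30.0],
    "Disopred": [0.0, 0.5, 1.0],
}
consensus = consensus_disorder(raw)
```

Rescaling before averaging keeps a predictor that reports on a wider numeric scale from dominating the consensus value.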
Figure 2.2. Disorder and CICP mapped residues of Family MSAs. A.) Bornaviridae B.)
Filoviridae C.) Paramyxoviridae D.) Rhabdoviridae. Each family was aligned according
to the process outlined in the methods section and ordered based on the results of the
phylogenetic tree (Fig 2.1). Each residue is represented by a colored column tick
corresponding to Disorder, CICP, both Disordered and CICP or neither a CICP nor
Disordered residue. Disordered residues are colored by an increase from yellow, being
lowest confidence of disorder, to red, highest confidence of residue disorder. CICPs are
shown in blue. Residues predicted to be both Disordered and a CICP are highlighted in
green. Residues that have neither a Disorder nor CICP prediction are represented in grey.
Gaps in the alignment are represented in white. The black ticks at the bottom of the
alignment denote residue position and occur every 25 residues. The color of the brackets
to the left of the alignment indicates virus families: Bornaviridae, green, Filoviridae,
orange, Paramyxoviridae, blue and Rhabdoviridae, red. Unassigned viruses are denoted
by stars colored by the family they are unassigned in.
Figure 2.3. Entire Order Disorder and CICP mapped residues on the MSA. All
sequences analyzed in the study were aligned using the process described in the methods
and put into order according to phylogenetic tree results (Fig 2.1). Each residue is
represented by a colored column tick corresponding to Disorder, CICP, both Disordered
and CICP or neither a CICP nor Disordered residue. Disordered residues are colored by an
increase from yellow, being lowest confidence of disorder, to red, highest confidence of
residue disorder. CICPs are shown in blue. Residues predicted to be both Disordered and
a CICP are highlighted in green. Residues that have neither a Disorder nor CICP
prediction are represented in grey. Gaps in the alignment are represented in white. The
black ticks at the bottom of the alignment denote residue position and occur every 25
residues. The color of the brackets to the left of the alignment indicates virus families:
Bornaviridae, green, Filoviridae, orange, Paramyxoviridae, blue and Rhabdoviridae, red.
Unassigned viruses are denoted by stars colored by the family they are unassigned in.
Paramyxoviridae displays a pattern of four regions of disorder at residues ~15-50,
~150-180, ~205-225, and after residue 400 in the MSA. Paramyxovirinae exhibits a
majority of disorder beyond the 400th residue in the MSA (Fig 2.2C, Table S1C).
Pneumovirinae has a significantly smaller region of disorder in the C-terminus compared
to the other sequences of Paramyxovirinae (Fig 2.2C). Rhabdoviridae sequences display
three regions of disorder with the largest concentration of disordered residues at the
C-terminus (Fig 2.2D, Table S1D). The two smaller regions of disorder are in the first half
of the proteins. One is within the first 100 residues of the amino terminus and the other
is approximately between residues 150-250 of the MSA (Fig 2.2D). The
Nucleorhabdoviruses, Cytorhabdoviruses and Novirhabdoviruses display a larger
concentration of disorder in these regions compared to the rest of Rhabdoviridae (Fig
2.2D). The sequences of the entire order exhibit three general regions of disorder
with the highest concentration of consecutively disordered amino acids predicted to be at
the C-terminus of the proteins (Fig 2.3).
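Regions of disorder such as those reported here are runs of consecutive residues whose consensus value is high. A minimal Python sketch of collapsing per-residue consensus values into residue ranges is given below. The 0.5 cutoff, the minimum run length and the function name are assumptions for illustration, since the text does not state the cutoff that was applied.

```python
def disordered_regions(scores, threshold=0.5, min_length=1):
    """Return (start, end) residue ranges, 1-based and inclusive, where the
    consensus score stays at or above `threshold` (cutoff is an assumption)."""
    regions = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = pos  # a new run of disordered residues begins
        elif start is not None:
            if pos - start >= min_length:
                regions.append((start, pos - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(scores) - start + 1 >= min_length:
        regions.append((start, len(scores)))
    return regions


# Toy consensus values for a seven-residue sequence.
toy_scores = [0.9, 0.8, 0.1, 0.2, 0.7, 0.6, 0.9]
all_regions = disordered_regions(toy_scores)
long_regions = disordered_regions(toy_scores, min_length=3)
```

Raising `min_length` is one way to ignore isolated residues and keep only the longer stretches that are discussed as regions.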
Co-evolution and Intra-residue Contact
To extract information about the structurally and functionally important residues
that are constrained by intra-protein evolutionary pressures, the results of four prediction
programs were combined into a consensus prediction. The results of the two intra-residue
contact predictors, ConSEQ[39] and CORNET[40, 41], were combined with the two
coevolving residue mutation predictors, XDET[38, 42] and CAPS[43], and the result is
referred to as the Co-evolution/Intra-residue contact prediction (CICP) consensus. CICPs
were observed for 36 of the 63 viral nucleoprotein sequences, from Rhabdoviridae and
the Paramyxoviridae subfamily Paramyxovirinae, while Bornaviridae and Filoviridae could
not be analyzed (Fig 2.2A&B). These sequences were not analyzed because they did not
meet the pair-wise identity criterion of 19-90%. The four prediction methods require
an MSA to have a minimum of 10 sequences meeting this criterion to produce statistically
significant results. The twenty-four Paramyxovirinae sequences that met the analysis
criteria display CICPs throughout the length of the sequence. The C-terminal regions of
the proteins contain few, if any, predicted CICPs in the region containing a high
concentration of disordered residues (Fig 2.2C). However, there is a distinct CICP
pattern of highly conserved residues at positions ~286-323 and ~360-416, and moderately
conserved residues at 225-261 throughout the Paramyxovirinae (Fig 2.4A). There is a
distinct area of residues that are both disordered and CICPs especially in TIOV,
Rubulaviruses, Henipaviruses, BEIV, JV and Morbilliviruses. The residues that display
disorder and CICP also correlate with hydrophobic residues and higher MSA
conservation as observed in Jalview [44]. Residues ~360-416 contain the largest number
of CICPs in the sequences correlating with the highest concentration of hydrophobic
residues as well as high conservation scores. Additional smaller patterns of CICPs are
observed at residues ~45 and ~112-130 with lower percentages of conservation in the
MSA. CICPs that flank a distinct region of disorder are observed at ~110-130 and ~225.
Areas displaying lower frequencies of CICPs also were observed to have lower levels of
hydrophobic residues and lower MSA conservation scores.
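The pair-wise identity criterion of 19-90% and the minimum of 10 sequences can be checked with a short Python sketch such as the one below. Two points are assumptions made for illustration and are not stated in the text: identity is computed only over columns where neither sequence has a gap, and an alignment is treated as usable only if every pair falls inside the window. The function names are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b, gap="-"):
    """Percent identity of two aligned sequences over their shared
    non-gap columns (one of several possible definitions)."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def usable_for_cicp(alignment, low=19.0, high=90.0, min_seqs=10):
    """alignment: dict of name -> aligned sequence.

    Assumes every pair must lie within [low, high] percent identity."""
    if len(alignment) < min_seqs:
        return False
    return all(
        low <= percent_identity(a, b) <= high
        for a, b in combinations(alignment.values(), 2)
    )


# Toy aligned pair: 3 of the 4 comparable columns match.
identity = percent_identity("MKT-A", "MRTLA")
```

Under this reading, a family represented by a single sequence, or by only a few sequences, fails the check before any identity is computed.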
Figure 2.4. CICP Alignment Consensus Graphs A.) Paramyxovirinae MSA. B.)
Rhabdoviridae MSA. C.) Order MSA. The number of CICPs occurring for a position of
the analyzed MSA was summed and divided by the total number of sequences that could
participate in the CICP study from that alignment (Paramyxovirinae had 24 sequences,
Rhabdoviridae had 12 sequences and the Order had 36 sequences). The y-axis is the
percentage of residues predicted to be a CICP and the x-axis is the residue position in
the MSA. The threshold of 50% was set to define a position as showing significant
conservation of a predicted CICP and is plotted in red. The CICP percentages are
plotted in blue.
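The per-position percentage described in the caption above can be sketched in Python as follows. Each sequence is represented as a list of booleans over the MSA columns, which is a representation chosen here for illustration. Treating a position at exactly 50% as conserved is also an assumption, since the caption does not say whether the threshold is inclusive.

```python
def cicp_percentages(cicp_flags):
    """Percentage of sequences predicted as CICP at each MSA column.

    cicp_flags: one list of booleans per sequence, all the same length.
    """
    n = len(cicp_flags)
    if n == 0:
        raise ValueError("at least one sequence is required")
    return [100.0 * sum(column) / n for column in zip(*cicp_flags)]


def conserved_positions(percentages, threshold=50.0):
    """1-based MSA positions at or above the threshold (inclusive by assumption)."""
    return [i + 1 for i, p in enumerate(percentages) if p >= threshold]


# Toy alignment of four sequences and three columns.
flags = [
    [True, False, True],
    [True, False, False],
    [True, True, False],
    [False, False, False],
]
percentages = cicp_percentages(flags)
conserved = conserved_positions(percentages)
```

In the toy case only the first column reaches the threshold, with three of four sequences predicted as CICPs.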
Twelve sequences meeting the analysis criteria among the Rhabdoviridae for
Lyssavirus, Ephemerovirus, and Vesiculovirus could be used to estimate CICPs. The
CICPs appear throughout the alignment and there is a dearth of correlation with predicted
contacts in the disordered C-terminus region (Fig 2.2D). There are three short regions of
high CICP conservation within the MSA observed at ~170-186, 351-367 and 431-473
(Fig 2.4B). These contacts also correlate with pockets of hydrophobic residues and MSA
sequence conservation.
Examining the MSA of the entire order reveals two regions with high
concentrations of conserved CICPs at ~382-426 and ~447-522 (Fig 2.3, 2.4C). These
regions correlate with higher frequencies of hydrophobic residues. There does not appear
to be a pattern for regions of residues predicted to be both disordered and CICPs
observable outside of the Paramyxovirinae.
Structural Analysis
To provide a structural perspective of how the disordered regions and CICPs
correlate with the nucleoprotein crystal structures solved in the last few years, we mapped
the results of the predictions onto these 3D structures. Using the crystal structure for the
RABV nucleoprotein complex (pdb id - 2GTT)[45] from the Research Collaboratory for
Structural Bioinformatics (RCSB) Protein Data Bank with the Chimera
molecular viewer[46], the disorder and CICPs were mapped to the structure by coloring
the residues. Figures 2.5A and 2.5C show the disordered regions of a RABV
nucleoprotein located mainly at the periphery of the folded structure in loop regions
corresponding to residues 378-401, 411-429 and 443-450 (Table S1D). Figure 2.5,
panels B and D, highlights the CICPs that appear primarily within the interior of the
protein where many residues show contact with distant residues. Figure 2.6 displays both
the disordered residues and CICPs of a single nucleoprotein and shows where they overlap
near the C-terminus. It should be noted that the crystal structure is missing structural
information for residues 373-397, which are predicted to be disordered; residue 383
is also predicted to be a CICP.
Figure 2.5. Disorder and CICP mapped Crystal structures of the Rabies Virus
Nucleoprotein-RNA complex (2GTT). A.) Nucleoprotein-RNA ring-complex cavity view
mapped with disordered residues in yellow. B.) Nucleoprotein-RNA ring-complex cavity
view mapped with CICP residues in blue. C.) Nucleoprotein-RNA ring-complex side
view mapped with disordered residues in yellow. D.) Nucleoprotein-RNA ring-complex
side view mapped with CICP residues in blue. Structure is missing information for
residues 1-6, 104-118, 185-187 and 373-397. Residues 1-2, 104-109, 378-396 are
predicted to be disordered.
Figure 2.6. CICP and Disorder mapped Crystal structures of the Rabies Virus
Nucleoprotein-RNA complex (2GTT) subunit-Chain A. A.) subunit-ChainA from cavity
view. B.) subunit-ChainA from a side view orientation. Residues predicted to be
disordered are in yellow, coevolving in blue and those predicted to be both disordered
and coevolving in green. Structure is missing information for residues 1-6, 104-118,
185-187 and 373-397.
Figure 2.7. Crystal structure of Vesicular Stomatitis Indiana Virus nucleocapsid
complexed with the phosphoprotein’s nucleocapsid-binding domain (3HHW). A.) 5
nucleoproteins colored green and cyan alternating to make them easily distinguishable
and 5 nucleoprotein-binding domains of the phosphoprotein colored in magenta and
purple. The predicted disordered residues are highlighted in yellow. The predicted
disordered nucleoprotein residues 354-367 are shown in contact with the binding domain
of the phosphoprotein. B.) Two nucleoproteins and two phosphoproteins. Chain K and L
are nucleoproteins colored green and cyan. Chains A and B are phosphoproteins colored
magenta and purple. The blue circle is highlighting the N-terminus of the nucleoprotein
and the blue squares indicate residues 354 and 367 on each N chain. Predicted disordered
residues are highlighted in yellow.
For a more specific look at the nucleoprotein interaction with the phosphoprotein,
a recent crystal structure of the Vesicular Stomatitis Indiana Virus (VSIV) N:RNA & P
complex (pdb id – 3HHZ) [22] was mapped with disorder predictions for the
nucleoprotein (Fig 2.7). The disordered region from residues 356-369 of the nucleoprotein,
chain K, appeared to be in contact with the phosphoprotein, chain A. To confirm that the
residues were indeed in contact, a MolProbity analysis of all-atom contacts[47] was
performed. The MolProbity results confirm that the phosphoprotein, chain A, residues
~214-219 and ~253-262 are in contact with the nucleoprotein, chain K, at residues
356-369. These correlations provide validation that the DisICC pipeline is a quick approach
for suggesting which residues are involved in intra- and inter-protein interactions when
little is known about structure.
Discussion
Phylogenetic Reconstruction
The BEASTv1.5.4 tree is consistent with previously published
relationships of the order (Fig 2.1) [48, 31]. From the tree structure it appears that BDV
and Filoviridae are closer to each other than they are to Rhabdoviridae or
Paramyxoviridae (Fig 2.1). This is an interesting finding, as a recent tree of the order using
portions of the polymerase grouped BDV with Rhabdoviridae [31]. However, the branch
length of BDV within Clade I is long, indicating that it is still distant from Filoviridae. This
result, produced by both MrBayes3.1 and BEASTv1.5.4, is strong evidence that the
nucleoprotein of BDV does not clade with Rhabdoviridae.
The Rhabdoviridae sequences in Clade III are organized into their respective
genera as expected (Fig 2.1). The relationship of FLAV with the Ephemeroviruses is
supported by the percent identity of the FLAV and BEFV nucleoprotein sequences
(36.38%), which indicates they are closer to one another than to any other
sequence in the study. This result is consistent between BEASTv1.5.4 and MrBayes3.1
analyses.
The phylogenetic reconstruction of the Paramyxovirinae subfamily reveals some
clear relationships of the previously unclassified viruses. Menangle Virus (MENV) and
the unclassified TIOV branch together within the Rubulaviruses. The association of
MENV with the Rubulaviruses is supported by earlier molecular characterization and
phylogenetic analysis [49]. The unclassified virus FDLV is an outgroup to the
Henipaviruses and Morbilliviruses. Previous results agree with this observation, as the
nucleoprotein gene of FDLV was shown to branch between the Henipaviruses,
Rubulaviruses and Morbilliviruses[50]. MOSV and TUPV group between the
Henipaviruses and Morbilliviruses. The grouping of MOSV and TUPV is
supported by previous phylogenetic work, and the results from this study agree with the
previous N results [51]. The grouping of the BEIV and JV nucleoproteins
between the Henipaviruses and Morbilliviruses is supported by previous phylogenetic
analysis [52].
Disorder
Disordered or intrinsically unstructured proteins (IUPs) are able to exist without a
defined secondary structure. It has been shown that these IUPs can assume a secondary
structure after interacting with their binding ligand. Such regions of disorder within
proteins are observed to be binding sites that assume a secondary structure
observable by x-ray crystallography when in association with the partner ligand [23,
24]. When unassociated from a binding ligand these disordered regions are often absent
from crystal structures. Disordered regions allow proteins to have many binding partners
and different functions based upon their conformations. The results from the disorder
predictions reveal that the C-terminus of the Mononegavirales viral nucleoproteins contains
the largest portion of disordered residues (Fig 2.2E, Table S1E). This illustrates the
conservation of function over sequence, as the amino acid conservation of this region is
low within each of the four families and, therefore, the entire order. These results also
support the previous disorder prediction work done on Paramyxovirinae. For example, in
SENV the C-terminal amino acids, 401-524, contain the P-N binding site[9]; this region
lacks residue conservation among the other Paramyxoviruses but does correspond with
a disordered region (Fig 2.2C) as observed previously (Jensen et al. 2008). Newcastle
Disease Virus (NCDV) was previously shown to contain a region associating with P within
the first 25 amino acids of the N-terminus[53]. Similar to SENV, this region lacks amino
acid sequence conservation, but a trend of conserved disordered residues is apparent in
that region among the other Paramyxoviruses (Fig 2.2C). Additionally, in NCDV
the C-terminal region at residues 376-489 appears to be unnecessary
for forming an eleven-subunit ring of the nucleocapsid, suggesting this region
functions separately from the formation of the N-RNA structure [53]. Disorder prediction
for NCDV shows a long disordered region encompassing that 376-489 aa region,
highlighting a possible interaction site for the phosphoprotein (Fig 2.2C). This interaction
could be related to the transcription/translation process [53]. In MeV residues 477-505
have been recognized to interact with the phosphoprotein [54]. Further, the disordered
region of the N-tail in MeV has been shown to bind to P even when isolated from all
other viral material [55] (Longhi et al. 2003), suggesting a strong overall trend of disorder
for the family of Paramyxoviridae in this region (Karlin et al. 2003, Bourhis et al. 2006).
In Rhabdoviridae the trend is less neatly organized, as these
sequences are more divergent than those of the other families, but it still highlights the
flexibility in the C-terminus. In addition to the C-terminal disorder observed in the other
families, a region within the first 20 amino acids of the Rhabdoviridae sequences in the
N-terminus is observed to contain disorder. In Lettuce Necrotic Yellow Virus (LNYV)
this disordered region is larger than the corresponding disorder predictions of the other
Rhabdoviruses, even the other Cytorhabdoviruses SCRV and Sonchus Yellow Net Virus
(SYNV) (Fig 2.2D). The region does correspond with the other N-terminal disordered
regions of smaller size in the other viruses. Interestingly, earlier in our studies the Orchid
Fleck Virus (OFV) showed the closest match in size to this N-terminal disordered region.
OFV had been classified as a tentative Rhabdovirus, but has since been removed due to
possessing a bipartite genome. OFV appears to go against the main trend of the other
Rhabdoviruses and the viral order by displaying a large disordered region in the
N-terminus (results not shown). As OFV is no longer in the family, these results are
likely due to the existence of the OFV genome as bipartite negative-sense RNA that
could require some further flexibility in function/structure compared to the
non-segmented genomes. As LNYV is a single-stranded virus, the similarity is either a
coincidence or an undetermined link.
Filoviridae displays a longer region of disorder in the C-terminus compared to the
other families (Fig 2.2B, 2.3). This larger disordered region may allow the protein to
maintain a similar conformation for the structural regions that are associated with the RNA
genome. The lack of conserved disorder within MARV compared to the three
Ebolaviruses in region 110-140 is of note (Fig 2.2B). In support of the disorder
prediction from residue ~400-670 in the Ebolaviruses, a study observed that the amino
acids 601-739 of the nucleoprotein were not required in the formation of the nucleocapsid
or replication of a shortened genome; as residues 670+ are predicted to contain secondary
structure, it appears their function is unrelated to binding partner ligands (Fig 2.2B) [3].
BDV is so different from the other viruses that it does not clearly group with them, as
illustrated by the large disordered region in the N-terminus compared to the majority of
other viruses (Fig 2.2A, 2.3). BDV does, however, contain a disordered C-terminal region
and two additional sequence regions of disorder that are congruent with the rest of the order
(Fig 2.2A, 2.3).
Co-evolution and Intra-residue Contact
In evolution functional constraints are expected to limit the amino acid
substitution rates, resulting in a higher conservation of structural/functional sites with
respect to the rest of the protein. Once a residue is changed, given the constraints
operating on it, this mutation can be compensated with an additional mutation of
corresponding residues across the [inter-protein] interface. This enables the co-evolution
of two proteins that can lead to both high specificity and affinity. These properties can be
applied to interactions such as intra-protein residue-pairs stabilizing the protein fold,
inter-protein binding residues and protein–nucleic acid interactions[25-27]. The
results of two intra-residue contact predictors, ConSEQ and CORNET, and two
coevolving residue mutation predictors, XDET and CAPS, were combined into a
consensus of structural/functional predictions. ConSEQ makes predictions by estimating
the rate of amino acid evolution at each position in a MSA of homologous proteins[39].
The underlying assumption of this approach is that, in general, structurally and
functionally important residues are slowly evolving. CORNET is a neural network-based
method using correlated mutations, sequence conservation, predicted secondary structure,
and evolutionary information[40, 41]. CAPS compares the correlated variance of the
evolutionary rates at two sites corrected by the time since the divergence of the protein
sequences[43]. XDET compares the mutational behavior of a residue position with the
mutational behaviors of the entire alignment, which assumes the positions showing a
family-dependent conservation pattern will have similar mutational behaviors as the rest
of the family[38, 42]. All these methods are combined into the CICP, which correlates
the structural and functional predictions with the residues that are constrained by
intra-protein evolutionary pressures. The concentration of CICPs correlates with the
evolutionary distances between the sequences used – the closer the evolutionary distances
within a region the higher the concentration of CICPs for that region given that it also
contains structural or functionally important residues.
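The idea of compensating mutations at two alignment positions can be illustrated with a simple covariation score. The Python sketch below computes the mutual information between two MSA columns. It is offered only as a generic illustration of correlated mutation analysis and is not the algorithm implemented by CAPS, XDET, CORNET or ConSEQ. The function name and the toy columns are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns,
    each given as a string with one residue per sequence."""
    n = len(col_i)
    if n == 0 or n != len(col_j):
        raise ValueError("columns must be non-empty and of equal length")
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (count / n) * log2(count * n / (count_i[a] * count_j[b]))
    return mi


# Columns that always change together score high; unrelated columns score zero.
covarying = mutual_information("AAVV", "LLII")
unrelated = mutual_information("AAVV", "LILI")
invariant = mutual_information("AAAA", "LLII")
```

A fully conserved column carries no covariation signal under this score, which is one reason conservation-based and covariation-based predictors complement each other.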
As illustrated by the results in Figures 2.2C, 2.2D and 2.3, there are many residues
that are predicted to be CICPs throughout the nucleoprotein sequences. Many of these
residues also seem to be in contact within the protein as shown in Figures 2.5B, 2.5D and
2.6. These CICPs are observed to be significantly lower in frequency within the
N-terminal portion of the nucleoproteins (Fig 2.3, 2.4). This absence is most likely linked
to this region being a part of the N:N interface, which would put these residues under
different evolutionary constraints of inter-protein interaction. A study of the PDPRV
nucleoprotein identified that residues 1-120 and 146-241 are required for the formation
and stability of the N:N interactions[56]. These residues needed for N:N stability
correlate with the absence of highly conserved concentrations of CICPs (Fig 2.2, 2.3,
2.4).
The majority of the CICPs fall in ~382-426 and ~447-522 within the entire order
(Fig 2.3, 2.4C), which correspond to residues ~286-323 and ~360-416 of
Paramyxoviridae (Fig 2.2C, 2.4A) and residues 351-367 and 431-473 of Rhabdoviridae
(Fig 2.2D, 2.4B). These regions are more conserved and contain more hydrophobic residues.
Combined with the high concentrations of CICPs, these regions appear to be important for
intra-protein structural/functional interactions. While the C-terminal region has been
previously shown to interact with the phosphoprotein and the first ~240 residues of the
N-terminus are part of the N:N interface, the region ~382-426 and ~447-522 (Fig 2.3,
2.4C) is well conserved, containing both a high concentration of hydrophobic residues and
a high frequency of CICPs. Logically such constraint would be due to the intra-protein
structure and function, and possibly the interactions associated with encapsidating the
RNA. This region would have less flexibility to mutate and, therefore, be conserved
within the families. Contained within this region for SENV are residues 362-371, which
were identified by point mutations to be essential in RNA replication[57]. The
Paramyxovirinae show little pattern of correlation between CICPs and the concentration
of disorder in the N-terminus; however, there is an overlap of residues that are correlated
mutations and predicted disordered in the C-terminus residues ~546-547 of the MSA (Fig
2.3, 2.4C). This overlap suggests these residues may play a role both in the structure of
the nucleoprotein, as indicated by the CICP, and in inter-protein interactions
at some time during the transcription/replication cycle, with conformational changes that
likely involve a binding ligand interaction with the phosphoprotein or polymerase.
Within the Vesiculoviruses, VSIV and Spring Viremia of Carp Virus (SVCV) (Fig 2.2D,
Table S1D) also display the disorder and CICP residue overlap, and these residues fall
into a previously identified region within RABV from residues ~298-352 that was
experimentally shown to be involved in RNA binding [58]. The RABV residues 315-319
and SENV residues 364-369 are aligned in the MSA, supporting functional similarity for
RNA binding at this region. Further, MolProbity analysis reveals that residues 287, 290, 291,
292, 312, 315 and 317 in VSIV N, which align within RABV residues 289-352, are in contact
with the RNA (data not shown).
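Statements that residues of two different nucleoproteins are aligned in the MSA rest on converting between alignment columns and each sequence's own residue numbering. A minimal Python sketch of that conversion is shown below. The function names and the toy sequences are hypothetical, and numbering is assumed to start at 1 for the first residue of each sequence.

```python
def column_to_residue(aligned_seq, gap="-"):
    """Map each 1-based MSA column to the 1-based residue number of the
    sequence at that column, or None where the sequence has a gap."""
    mapping = {}
    residue = 0
    for col, char in enumerate(aligned_seq, start=1):
        if char == gap:
            mapping[col] = None
        else:
            residue += 1
            mapping[col] = residue
    return mapping


def aligned_partner(seq_a, seq_b, residue_a, gap="-"):
    """Residue number in seq_b that shares an MSA column with residue_a
    of seq_a, or None if seq_b has a gap there or residue_a is not found."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    map_a = column_to_residue(seq_a, gap)
    map_b = column_to_residue(seq_b, gap)
    for col, res in map_a.items():
        if res == residue_a:
            return map_b[col]
    return None


# Toy aligned pair with gaps in different places.
toy_a = "MK--TA"
toy_b = "-KLLTA"
partner_of_3 = aligned_partner(toy_a, toy_b, 3)
partner_of_1 = aligned_partner(toy_a, toy_b, 1)
```

Residue 3 of the first toy sequence falls in column 5, which holds residue 4 of the second sequence, while residue 1 has no partner because the second sequence has a gap in that column.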
Structural Analysis
Based on the distribution of the large disordered regions of the C-terminus of the
nucleoproteins being at the fringes of the nucleocapsid-ring complex (Fig 2.5A, 2.5C) it
can be inferred that these disordered regions are responsible for interacting with other
nucleoproteins. When multiple units of these highlighted complexes are lined up it is
obvious that a large disordered region exists that could offer access to the RNA genome
encapsidated within. This disordered region could then also be defined as interfacing
with the phosphoprotein, which would likely be coupled with the L polymerase to
provide an interaction site for facilitating transcription or replication of the genome. This
hypothesis is further supported by a previous study that found the RABV N-RNA rings
had bound phosphoprotein on the tips of the rings when stained and visualized with
electron microscopy[59]. More recently, a crystal structure of the VSIV N:RNA & P
complex has been solved[22] and was used to examine the mapping of the predictions to
the identified binding regions in the nucleoprotein (Fig 2.7). The results of the mapping
show that the predicted disordered region in the C-terminus is bound to the
phosphoprotein. Further, this binding region lacks CICPs calculated for the intra-protein
interactions. The presence of the disorder and absence of the intra-protein interactions in
the binding region supports what we would expect biologically and, therefore, we can
infer that similar characterization of the other proteins of the order Mononegavirales with
the same disorder and CICP predictions highlights their regions of interaction.
From the evidence of this study and the corroborating findings of individual viral
nucleoproteins from previous studies we can strongly infer that Rhabdoviridae and
Paramyxoviridae, and more generally the other viruses in Mononegavirales, have similar
functional/structural regions corresponding specifically to those regions showing
conservation in disorder and co-evolution even though they may have weak amino acid
sequence conservation. Specifically, the C-terminal end of the nucleoprotein is predicted
to be involved with binding to the phosphoprotein in a manner important to
transcription/replication and not necessarily important to the formation of the
nucleocapsid for every virus evaluated in this study. Also, it appears that evolution has
constrained the function of some binding proteins not simply through sequence
conservation but through conserving regions to remain disordered. These disorder and
CICP residue presence and absence findings are validated by the existing experimental
and crystal structure information for RABV (Fig 2.5, 2.6) and VSIV (Fig 2.7). This
concordance provides confidence that the DisICC pipeline predictions are valuable for
sequences currently without structural information, such as MuV and NIPH, which both
infect humans. The validation of the DisICC disorder predictions and presence or absence
of CICPs with previous structural and experimental observations supports our ongoing
studies using predictive methods involving the other two proteins, P and L, that make up
the transcription/replication complex.
The validation of this study by current structural information illustrates that the
combination of evolutionary dynamics, disorder prediction, intra-protein
structure/function predictions and co-evolving residue prediction provides the ability to
identify residues and regions important for protein-ligand interactions, intra-protein
interactions and protein-protein monomer interfaces. The DisICC pipeline uses sequence
information to characterize proteins by predicting the residues and regions that would need
to be targeted for disruption in viruses that have little structural information available.
As more viruses are discovered, and epidemics occur, methods such as the DisICC
pipeline can quickly provide information to aid researchers with response and
development of treatments without structural information on these new and emerging
viruses. For example, DisICC has the ability to produce information about protein
residue positions in emerging viral strains that would point to changes resulting from new
selective pressures providing researchers with possible regions to target as well as further
insight into viral evolutionary strategies. The information a method like DisICC provides
would also point to protein regions likely to remain unchanged as these viruses mutate
thereby indicating new targets in the development of longer-lived treatments. DisICC can
also be applied to other multi-protein systems to identify structurally/functionally
conserved residues to disrupt, and even possible ligand-binding regions, without 3D
structure information.
In summary, experimental and structural data validate a combined analytical
approach to predicting residues and regions important for protein-ligand interactions,
intra-protein interactions and protein-protein monomer interfaces. We have created the
DisICC pipeline to continue our studies on the structure/function of the three proteins
necessary for the replication/transcription complex of the order Mononegavirales. This
pipeline will also aid other researchers in inferring contacts among protein complexes
when little structural information is available.
Materials and Methods
Phylogenetic Reconstruction
The multiple sequence alignments for each family were created by submitting the
sequences to the MAFFT ver.6 server (http://mafft.cbrc.jp/alignment/server/index.html)
using the E-INS-i strategy. Each family alignment was manually curated to ensure
optimal alignments. For the alignment of the entire order, the independent family
alignments were combined into one FASTA file and submitted to the MAFFT ver. 6
alignment server using the E-INS-i strategy[26]. The MSA output was then manually
curated due to the wide divergence of the sequences. This alignment was the input for
MrBayes3.1[29, 30] and BEASTv1.5.4[60] for the generation of the phylogenetic trees.
The parameters used for MrBayes3.1 were a mixed amino acid model, eight category
gamma distribution rate, and 1,000,000 generations of the Markov Chain Monte Carlo
analysis. In our studies, constraints were designed from our knowledge of the family
classifications of the sequences resulting in four constraints. It should be noted that,
although the constraint parameter was invoked for the trees, MrBayes3.1 overrides any
constraint if the data do not support it. It has previously been shown that MrBayes3.1,
with appropriate constraints, produced trees with higher confidence at each node than
other tree methods: neighbor-joining, minimum evolution, maximum parsimony, and the
un-weighted pair group method with arithmetic mean[61]. The outgroup used was BDV
due to its difference from the other families. The BEASTv1.5.4 tree was created using
two independent Bayesian MCMC chains (10 million steps, 10% burn-in) run under the
WAG amino acid substitution model[62] and rate heterogeneity among sites (four
category gamma distribution rate). Monophyletic taxon sets consisting of Filoviridae,
Rhabdoviridae and Paramyxoviridae were also used in the model. The following viral
proteins were included in the study: SEBOV, Sudan Ebola Virus (YP_138520.1);
ZEBOV, Zaire Ebola Virus (NP_066243.1); REBOV, Reston Ebola Virus
(NP_690580.1); MARV, Lake Victoria Marburgvirus (NP_042025.1); BDV, Borna
Virus (NP_042020.1); HMPNV, Human Metapneumovirus (YP_012605.1); AVPNV,
Avian Pneumovirus (AAT58236.1); HRSVB1, Human Respiratory Syncytial Virus B1
(NP_056858.1); HRSVA2, Human Respiratory Syncytial Virus A2 (P03418); HRSVS2,
Human Respiratory Syncytial Virus S2 (AAC57022.1); RSV, Respiratory Syncytial Virus
(NP_044591.1); BRSV, Bovine Respiratory Syncytial Virus (NP_048050.1); PNVM15,
Pneumonia Virus of Mice 15 (AAW02834.1); PNVMJ3666, Pneumonia Virus of Mice
J3666 (YP_173326.1); MuV, Mumps Virus (NP_054707.1); TIOV, Tioman Virus
(NP_665864.1); MENV, Menangle Virus (YP_415508.1); SPIV41, Simian Parainfluenza
Virus 41 (YP_138504.1); HPIV2, Human Parainfluenza Virus 2 (NP_598401.1); SPIV5,
Simian Parainfluenza Virus 5 (YP_138511.1); AVPMV6, Avian Paramyxovirus 6
(NP_150057.1); GPV, Goose Paramyxovirus SF02 (NP_872273.1); NCDV, Newcastle
Disease Virus (NP_071466.1); TUPV, Tupaia Paramyxovirus (NP_054690.1); FDLV,
Fer-de-lance Virus (NP_899654.1); NIPH, Nipah Virus (NP_112021.1); HV, Hendra
Virus (NP_047106.1); MOSV, Mossman Virus (NP_958048.1); BEIV, Beilong Virus
(YP_512244.1); JV, J Virus (YP_338075.1); CDV, Canine Distemper Virus
(NP_047201.1); PDV, Phocine Distemper Virus (CAA53376.1); DMV, Dolphin
Morbillivirus (NP_945024.1); PDPRV, Peste-des-petits-ruminants Virus (YP_133821.1);
MeV, Measles Virus (NP_056918.1); RPV, Rinderpest Virus (YP_087120.2); HPV1,
Human Parainfluenza Virus 1 (NP_604433.1); SENV, Sendai Virus (NP_056871.1);
BPV3, Bovine Parainfluenza Virus 3 (NP_037641.1); HPV3, Human Parainfluenza
Virus 3 (NP_067148.1); FLAV, Flanders Virus (AAN73283.1); BEFV, Bovine
Ephemeral Fever Virus (NP_065398.1); SCRV, Siniperca Chuatsi Rhabdovirus
(YP_802937.1); ISFV, Isfahan Virus (Q5K2K7); CHPV, Chandipura Virus (P11211);
SVCV, Spring Viremia of Carp Virus (NP_116744.1); VSNJV, Vesicular Stomatitis New
Jersey Virus (P04881); VSIV, Vesicular Stomatitis Indiana Virus (NP_041712.1);
VSSJV, Vesicular Stomatitis San Juan Virus (P03521); ABLV, Australian Bat
Lyssavirus (NP_478339.1); RABV, Rabies Virus (NP_056793.1); MOKV, Mokola
Lyssavirus (YP_142350.1); NCMV, Northern Cereal Mosaic Virus (NP_057954.1);
LNYV, Lettuce Necrotic Yellows Virus (YP_425087.1); SYNV, Sonchus Yellow Net
Virus (NP_042281.1); MFSV, Maize Fine Streak Virus (YP_052843.1); RYSV, Rice
Yellow Stunt Virus (NP_620496.1); MMV, Maize Mosaic Virus (YP_052850.1); TVCV,
Taro Vein Chlorosis Virus (YP_224078.1); SNAKV, Snakehead Virus (NP_050580.1);
VHSV, Viral Hemorrhagic Septicemia Virus (NP_049545.1); HIRV, Hirame Virus
(NP_919030.1); IHNV, Infectious Hematopoietic Necrosis Virus (NP_042676.1).
Disorder
Disorder calculations were performed using the PONDR, IUPred [35, 36],
DisoPRED2 [38] and DisEMBL [37] prediction programs. PONDR was run under the
default settings and the VL-XT results were used. IUPred was run under the long
sequence default settings. DisEMBL was run using default settings and the Hot-loop and
Coil results were both included in our evaluation. DisoPRED2 was run under default
settings. All the disorder prediction results from these methods were normalized to a 0-1
scale of disorder, with values of 0.5 and greater indicating the tendency of a residue to be
considered disordered. These normalized values were then averaged to a
consensus value on the same scale. This calculated value is used as the overall
indicator for the prediction of disorder in the results. It should be noted that this
consensus method provides an overall conservative prediction of disorder, revealing
residues with high probability of disorder and limiting over-prediction.
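The consensus step described above can be illustrated with a minimal Python sketch. It is not the DisICC implementation: the text does not state how each predictor's output was normalized, so the piecewise-linear `rescale` function (which maps a predictor's own cutoff onto 0.5) is an assumption, and the function names and example scores are hypothetical.

```python
from statistics import mean


def rescale(score, cutoff, lo=0.0, hi=1.0):
    """Map a raw score onto 0-1 so that the predictor's own cutoff lands on 0.5.

    Assumption of this sketch: a piecewise-linear rescaling. The source does
    not document the normalization that was actually used.
    """
    if score < cutoff:
        return 0.5 * (score - lo) / (cutoff - lo)
    return 0.5 + 0.5 * (score - cutoff) / (hi - cutoff)


def consensus_disorder(per_predictor, threshold=0.5):
    """Average normalized per-residue scores and flag residues >= threshold.

    per_predictor maps a predictor name to a list of normalized scores,
    one per residue; every list must cover the same residues.
    """
    tracks = list(per_predictor.values())
    if len({len(track) for track in tracks}) != 1:
        raise ValueError("all predictors must score the same residues")
    consensus = [mean(column) for column in zip(*tracks)]
    flags = [value >= threshold for value in consensus]
    return consensus, flags


if __name__ == "__main__":
    # Hypothetical scores for a four-residue peptide (illustrative only).
    scores = {
        "predictor_a": [0.9, 0.7, 0.2, 0.1],
        "predictor_b": [0.8, 0.3, 0.3, 0.1],
        # A predictor whose native cutoff is 0.25 rather than 0.5.
        "predictor_c": [rescale(s, cutoff=0.25) for s in (0.5, 0.3, 0.1, 0.0)],
    }
    consensus, flags = consensus_disorder(scores)
    for position, (value, flag) in enumerate(zip(consensus, flags), start=1):
        print(position, round(value, 3), "disordered" if flag else "ordered")
```

Because the consensus is a plain average, a residue is flagged only when the predictors agree on balance, which is consistent with the conservative behaviour noted in the text.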
Correlated Mutations and
Intra-Residue Contact Prediction
The correlated mutation prediction programs used in this study were XDET[38,
42] and CAPS[43] and the intra-residue contact prediction programs implemented were
ConSEQ[39] and CORNET[40, 41]. The input files for these applications were generated
by calculating the pairwise percent identities within each family. MSAs of nucleoprotein
amino acid sequences with less than 90% but greater than 19% sequence identity were
used in the analyses. XDET, CAPS and CORNET were all run under the default
parameters and ConSEQ used all defaults except the “amino acid conservation method”
was set to Bayesian. The resulting predictions from each program were combined and
any residue that showed positive agreement among three or more predictors was classified
as a CICP. Conservation of CICPs within the alignments is calculated per
alignment position by summing the CICP occurrences per column and dividing by the
total number of sequences that participated in the CICP study for that alignment.
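The voting rule and the per-column conservation calculation described above might be sketched as follows. This is an illustration under stated assumptions, not the pipeline's code: the function names and data layout are hypothetical, each predictor's output is assumed to have already been reduced to a set of flagged residue positions, and the mapping of residue positions to alignment columns is assumed to have been done beforehand.

```python
from collections import Counter


def cicp_residues(predictions, min_votes=3):
    """Return the residues flagged by at least min_votes predictors.

    predictions maps a predictor name to the set of residue positions
    that predictor flagged in one sequence.
    """
    votes = Counter()
    for flagged in predictions.values():
        votes.update(set(flagged))  # one vote per predictor per residue
    return {position for position, n in votes.items() if n >= min_votes}


def cicp_conservation(cicp_columns_by_seq, alignment_length):
    """Fraction of sequences carrying a CICP at each alignment column.

    cicp_columns_by_seq maps a sequence name to the set of 0-based
    alignment columns at which that sequence has a CICP. Sequences with
    no CICPs still count toward the denominator.
    """
    n_seqs = len(cicp_columns_by_seq)
    if n_seqs == 0:
        raise ValueError("no sequences supplied")
    counts = Counter()
    for columns in cicp_columns_by_seq.values():
        counts.update(set(columns))
    return [counts[column] / n_seqs for column in range(alignment_length)]


if __name__ == "__main__":
    # Hypothetical predictor output for a single sequence.
    predictions = {
        "XDET": {10, 11, 12},
        "CAPS": {10, 12},
        "ConSEQ": {10, 11, 12},
        "CORNET": {11, 40},
    }
    print(sorted(cicp_residues(predictions)))
```

Keeping the vote threshold as a parameter makes it straightforward to check how sensitive the CICP set is to requiring agreement from all four predictors instead of three.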
Hydrophobic Residues and MSA Conservation
The correlation of residues in the MSAs that contained hydrophobic residues
and/or high MSA sequence conservation was studied using Jalview [44]. Jalview
provides visualization of hydrophobicity and sequence conservation. Conservation
annotation scores were then compared with hydrophobicity for the MSA residues that
displayed CICPs.
Structural Analysis
The predictions of disorder and correlated mutations were
corroborated against structural information. The existing crystal structure for the
nucleoprotein complex of RABV (PDB ID 2GTT) was selected for comparison. The
amino acid sequence information from the PDB file was extracted for
individual nucleoprotein subunits and aligned with the corresponding amino acid
sequence used in the predictions. The aligned positions were then used to map the
appropriate prediction to the crystal structure with a color to highlight the corresponding
residue. Chimera[46] used the prediction and alignment information to create the
highlighted PDB images.
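The position-mapping step can be sketched in Python as below. This is a hypothetical illustration, assuming the prediction sequence and the sequence extracted from the structure file have already been pairwise aligned with "-" as the gap character; the function names are not taken from the pipeline. The structure index returned here is the ordinal position in the extracted chain sequence, which would still need to be translated to the residue numbering used in the structure file before coloring.

```python
def map_aligned_positions(aligned_query, aligned_structure, gap="-"):
    """Map 1-based query residue numbers to 1-based structure sequence indices.

    Both arguments are rows of the same pairwise alignment and must have
    equal length. Columns where either row has a gap produce no mapping.
    """
    if len(aligned_query) != len(aligned_structure):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    q_pos = s_pos = 0
    for q_char, s_char in zip(aligned_query, aligned_structure):
        if q_char != gap:
            q_pos += 1
        if s_char != gap:
            s_pos += 1
        if q_char != gap and s_char != gap:
            mapping[q_pos] = s_pos
    return mapping


def transfer_flags(flagged_query_residues, mapping):
    """Translate flagged query residues to structure indices.

    Residues with no counterpart in the structure (for example, segments
    missing from the crystal) are dropped.
    """
    return sorted(mapping[r] for r in flagged_query_residues if r in mapping)


if __name__ == "__main__":
    # Toy alignment: the structure lacks the first query residue and
    # carries one extra residue relative to the query.
    mapping = map_aligned_positions("MKT-AL", "-KTGAL")
    print(mapping)
    print(transfer_flags({1, 3, 5}, mapping))
```

Dropping unmapped residues rather than raising an error reflects the fact that disordered segments are often unresolved in crystal structures, so some predicted residues are expected to have no structural counterpart.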
To explore predicted features that may point to protein-protein interaction, the
crystal structure of the VSIV N:RNA & P complex (PDB ID 3HHZ) was used. The
nucleoproteins in the complex were mapped using the same method as above. MolProbity
all-atom-contact analysis [47] was conducted to verify interacting residues between the N
and P proteins, and RNA interactions. The results were compared with the disordered
residues, and the residues found to be in contact between N and P were reported.
Acknowledgements
We thank Jacques Perrault, Jonathan Hilmer and Melissa Robertson for critical review of
the manuscript.
References
1. Horie M, Honda T, Suzuki Y, Kobayashi Y, Daito T, et al. (2010) Endogenous non-retroviral RNA virus elements in mammalian genomes. Nature 463: 84-87.
2. Becker S, Huppertz S, Klenk H, Feldmann H (1994) The nucleoprotein of Marburg
virus is phosphorylated. J Gen Virol 75: 809-818.
3. Watanabe S, Noda T, Kawaoka Y (2006) Functional Mapping of the Nucleoprotein of
Ebola Virus. J Virol 80: 3743-3751.
4. Chuang J, Perrault J (1997) Initiation of vesicular stomatitis virus mutant polR1
transcription internally at the N gene in vitro. J. Virol. 71: 1466-1475.
5. Lichty BD, Power AT, Stojdl DF, Bell JC (2004) Vesicular stomatitis virus: re-inventing the bullet. Trends in Molecular Medicine 10: 210-216.
6. Erik Johnson J, Coleman JW, Kalyan NK, Calderon P, Wright KJ, et al. (2009) In
vivo biodistribution of a highly attenuated recombinant vesicular stomatitis virus
expressing HIV-1 Gag following intramuscular, intranasal, or intravenous
inoculation. Vaccine 27: 2930-2939.
7. Koser ML, McGettigan JP, Tan GS, Smith ME, Koprowski H, et al. (2004) Rabies
virus nucleoprotein as a carrier for foreign antigens. Proceedings of the National
Academy of Sciences of the United States of America 101: 9405-9410.
8. Whelan S, Barr J, Wertz G (2004) Transcription and replication of nonsegmented
negative-strand RNA viruses. Current Topics in microbiology and immunology 283:
61-119.
9. Cevik B, Kaesberg J, Smallwood S, Feller JA, Moyer SA (2004) Mapping the
phosphoprotein binding site on Sendai virus NP protein assembled into
nucleocapsids. Virology 325: 216-224.
10. Chuang JL, Jackson RL, Perrault J (1997) Isolation and Characterization of Vesicular
Stomatitis Virus PolR Revertants: Polymerase Readthrough of the Leader-N Gene
Junction Is Linked to an ATP-Dependent Function. Virology 229: 57.
11. Murphy LB, Loney C, Murray J, Bhella D, Ashton P, et al. (2003) Investigations into
the amino-terminal domain of the respiratory syncytial virus nucleocapsid protein
reveal elements important for nucleocapsid formation and interaction with the
phosphoprotein. Virology 307: 143-153.
12. Moyer SA, Smallwood-Kentro S, Haddad A, Prevec L (1991) Assembly and
Transcription of Synthetic Vesicular Stomatitis Virus Nucleocapsids. J Virol 65:
2170-2178.
13. Takacs AM, Barik S, Ban AK (1992) Phosphorylation of specific serine residues
within the acidic domain of the phosphoprotein of vesicular stomatitis virus regulates
transcription in vitro. J Virol 66: 5842-5848.
14. Howard M, Wertz GW (1989) Vesicular Stomatitis Virus RNA Replication: a Role
for the NS Protein. Journal of General Virology 70: 2683-2694.
15. La Ferla FM, Peluso RW (1989) The 1:1 N-NS Protein Complex of Vesicular
Stomatitis Virus Is Essential for Efficient Genome Replication. J Virol 63: 3582-3857.
16. Stokes HL, Easton AJ, Marriott AC (2003) Chimeric pneumovirus nucleocapsid (N)
proteins allow identification of amino acids essential for the function of the respiratory
syncytial virus N protein. Journal of General Virology 84: 2679-2683.
17. Meric C, Spehner D, Mazarin V (1994) Respiratory syncytial virus nucleocapsid
protein (N) expressed in insect cells forms nucleocapsid-like structures. Virus
Research 31: 187-201.
18. Green TJ, Zhang X, Wertz GW, Luo M (2006) Crystal Structure of Vesicular
Stomatitis Virus Nucleoprotein-RNA Complex. Science 313: 357-360.
19. Tawar RG, Duquerroy S, Vonrhein C, Varela PF, Damier-Piolle L, et al. (2009)
Crystal Structure of a Nucleocapsid-Like Nucleoprotein-RNA Complex of
Respiratory Syncytial Virus. Science 326: 1279-1283.
20. Zhang X, Green TJ, Tsao J, Qiu S, Luo M (2008) Role of Intermolecular Interactions
of Vesicular Stomatitis Virus Nucleoprotein in RNA Encapsidation. J Virol 82: 674-682.
21. Luo M, Green TJ, Zhang X, Tsao J, Qiu S (2007) Structural comparisons of the
nucleoprotein from three negative strand RNA virus families. Virology Journal 4: 17.
22. Green TJ, Zhang X, Wertz GW, Luo M (2006) Structure of the Vesicular Stomatitis
Virus Nucleoprotein-RNA Complex. Science 313: 357-360.
23. Tompa P (2002) Intrinsically unstructured proteins. Trends in Biochemical Sciences
27: 527-533.
24. Tsai C, Ma B, Sham YY, Kumar S, Nussinov R (2001) Structured disorder and
conformational selection. Proteins: Structure, Function, and Genetics 44: 418-427.
25. Pazos F, Helmer-Citterich M, Ausiello G, Valencia A (1997) Correlated mutations
contain information about protein-protein interaction. Journal of Molecular Biology
271: 511-523.
26. Pollock DD, Taylor WR, Goldman N (1999) Coevolving protein residues: maximum
likelihood identification and relationship to structure. Journal of Molecular Biology
287: 187-198.
27. Fraser HB, Hirsh AE, Steinmetz LM, Scharfe C, Feldman MW (2002) Evolutionary
Rate in the Protein Interaction Network. Science 296: 750-752.
28. Katoh K, Toh H (2008) Recent developments in the MAFFT multiple sequence
alignment program. Brief Bioinform 9: 286-298.
29. Huelsenbeck JP, Ronquist F, Nielsen R, Bollback JP (2001) Bayesian Inference of
Phylogeny and Its Impact on Evolutionary Biology. Science 294: 2310-2314.
30. Ronquist F, Huelsenbeck JP (2003) MrBayes 3: Bayesian phylogenetic inference
under mixed models. Bioinformatics 19: 1572-1574.
31. Assenberg R, Delmas O, Morin B, Graham S, De Lamballerie X, et al. Genomics and
structure/function studies of Rhabdoviridae proteins involved in replication and
transcription. Antiviral Research In Press, Corrected Proof. Available at:
http://www.sciencedirect.com/science/article/B6T2H-4YG7JR13/2/c01c2b34a5acfca598f9b575a7a052a7.
32. Romero P, Obradovic Z, Dunker AK (1997) Sequence data analysis for long
disordered regions prediction in the calcineurin family. Genome Informatics 8: 110-124.
33. Romero P, Obradovic Z, Li X, Garner E, Brown C, et al. (2001) Sequence
complexity of disordered protein. Proteins: Structure, Function, and Bioinformatics
42: 38-48.
34. Li X, Romero P, Rani M, Dunker AK, Obradovic Z (1999) Predicting protein
disorder for N-, C-, and internal regions. Genome Informatics 10: 30-40.
35. Dosztanyi Z, Csizmok V, Tompa P, Simon I (2005) The Pairwise Energy Content
Estimated from Amino Acid Composition Discriminates between Folded and
Intrinsically Unstructured Proteins. Journal of Molecular Biology 347: 827-839.
36. Dosztanyi Z, Csizmok V, Tompa P, Simon I (2005) IUPred: web server for the
prediction of intrinsically unstructured regions of proteins based on estimated energy
content. Bioinformatics 21: 3433-3434.
37. Linding R, Jensen LJ, Diella F, Bork P, Gibson TJ, et al. (2003) Protein Disorder
Prediction: Implications for Structural Proteomics. Structure 11: 1453-1459.
38. del Sol Mesa A, Pazos F, Valencia A (2003) Automatic Methods for Predicting
Functionally Important Residues. Journal of Molecular Biology 326: 1289-1302.
39. Berezin C, Glaser F, Rosenberg J, Paz I, Pupko T, et al. (2004) ConSeq: the
identification of functionally and structurally important residues in protein sequences.
Bioinformatics 20: 1322-1324.
40. Olmea O, Valencia A (1997) Improving contact predictions by the combination of
correlated mutations and other sources of sequence information. Folding and Design
2: S25-S32.
41. Fariselli P, Casadio R (1999) A neural network based predictor of residue contacts in
proteins. Protein Eng. 12: 15-21.
42. Pazos F, Rausell A, Valencia A (2006) Phylogeny-independent detection of
functional residues. Bioinformatics 22: 1440-1448.
43. Fares MA, McNally D (2006) CAPS: coevolution analysis using protein sequences.
Bioinformatics 22: 2821-2822.
44. Waterhouse AM, Procter JB, Martin DMA, Clamp M, Barton GJ (2009) Jalview
Version 2--a multiple sequence alignment editor and analysis workbench.
Bioinformatics 25: 1189-1191.
45. Albertini AAV, Wernimont AK, Muziol T, Ravelli RBG, Clapier CR, et al. (2006)
Crystal Structure of the Rabies Virus Nucleoprotein-RNA Complex. Science 313:
360-363.
46. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, et al. (2004)
UCSF Chimera—A visualization system for exploratory research and analysis. J.
Comput. Chem. 25: 1605-1612.
47. Davis IW, Leaver-Fay A, Chen VB, Block JN, Kapral GJ, et al. (2007) MolProbity:
all-atom contacts and structure validation for proteins and nucleic acids. Nucl. Acids
Res. 35: W375-383.
48. McCarthy AJ, Goodman SJ (2010) Reassessing conflicting evolutionary histories of
the Paramyxoviridae and the origins of respiroviruses with Bayesian multigene
phylogenies. Infection, Genetics and Evolution 10: 97-107.
49. Bowden TR, Westenberg M, Wang L, Eaton BT, Boyle DB (2001) Molecular
Characterization of Menangle Virus, a Novel Paramyxovirus which Infects Pigs,
Fruit Bats, and Humans. Virology 283: 358-373.
50. Kurath G, Batts WN, Ahne W, Winton JR (2004) Complete Genome Sequence of
Fer-de-Lance Virus Reveals a Novel Gene in Reptilian Paramyxoviruses. J. Virol. 78:
2045-2056.
51. Miller PJ, Boyle DB, Eaton BT, Wang L (2003) Full-length genome sequence of
Mossman virus, a novel paramyxovirus isolated from rodents in Australia. Virology
317: 330-344.
52. Li Z, Yu M, Zhang H, Magoffin DE, Jack PJ, et al. (2006) Beilong virus, a novel
paramyxovirus with the largest genome of non-segmented negative-stranded RNA
viruses. Virology 346: 219-228.
53. Kho CL, Tan WS, Tey BT, Yusoff K (2004) Regions on nucleocapsid protein of
Newcastle disease virus that interact with its phosphoprotein. Archives of Virology
149: 997-1005.
54. Kingston RL, Hamel DJ, Gay LS, Dahlquist FW, Matthews BW (2004) Structural basis for
the attachment of a paramyxoviral polymerase to its template. PNAS 101: 8301-8306.
55. Bourhis J, Johansson K, Receveur-Brechot V, Oldfield CJ, Dunker KA, et al. (2004)
The C-terminal domain of measles virus nucleoprotein belongs to the class of
intrinsically disordered proteins that fold upon binding to their physiological partner.
Virus Research 99: 157-167.
56. Bodjo SC, Lelenta M, Couacy-Hymann E, Kwiatek O, Albina E, et al. (2008)
Mapping the Peste des Petits Ruminants virus nucleoprotein: Identification of two
domains involved in protein self-association. Virus Research 131: 23-32.
57. Myers TM, Smallwood S, Moyer SA (1999) Identification of nucleocapsid protein
residues required for Sendai virus nucleocapsid formation and genome replication. J
Gen Virol 80: 1383-1391.
58. Kouznetzoff A, Buckle M, Tordo N (1998) Identification of a region of the rabies
virus N protein involved in direct binding to the viral RNA. J Gen Virol 79: 1005-1013.
59. Schoehn G, Iseni F, Mavrakis M, Blondel D, Ruigrok RWH (2001) Structure of
Recombinant Rabies Virus Nucleoprotein-RNA Complex and Identification of the
Phosphoprotein Binding site. J Virol 75: 490-498.
60. Drummond A, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by
sampling trees. BMC Evolutionary Biology 7: 214.
61. Basta HA, Cleveland SB, Clinton RA, Dimitrov AG, McClure MA (2009) Evolution
of Teleost Fish Retroviruses: Characterization of New Retroviruses with Cellular
Genes. J. Virol. 83: 10152-10162.
62. Whelan S, Li P, Goldman N (2001) Molecular phylogenetics: state-of-the-art
methods for looking into the past. Trends Genet 17: 262-272.
63. Drummond AJ, Ho S, Phillips M, Rambaut A (2006) Relaxed phylogenetics and
dating with confidence. PLoS Biology 4: e88.
DISORDER, INTRA-RESIDUE CONTACT AND COEVOLUTION PREDICTION OF
THE LARGE SUBUNIT POLYMERASE AND PHOSPHOPROTEIN FOR THE
ORDER MONONEGAVIRALES USING THE DISICC PIPELINE
Contribution of Authors and Co-Authors
Manuscript in Chapter 3
Author: Sean B. Cleveland
Contributions: Conceived and designed the experiments, performed the experiments,
analyzed the data and wrote the paper.
Co-Author: Marcella A. McClure
Contributions: Conceived and designed the experiments, analyzed the data and wrote the
paper.
Manuscript Information Page
Sean B. Cleveland, Marcella A. McClure
PeerJ
Status of Manuscript: (Put an x in one of the options below)
__X_ Prepared for submission to a peer-reviewed journal
____ Officially submitted to a peer-review journal
____ Accepted by a peer-reviewed journal
____ Published in a peer-reviewed journal
PeerJ
Abstract
The goal of this bioinformatic study is to investigate sequence conservation in
relation to evolutionary function/structure of the large subunit polymerase protein (L) and
Phosphoprotein (P) of the Order Mononegavirales. In the combined analysis of 63
representative L and P protein sequences from four viral families (Paramyxoviridae,
Rhabdoviridae, Filoviridae, and Bornaviridae) I predict the regions of protein disorder,
intra-residue contact and co-evolving residues using my Disorder, Intra-residue contact
and Compensatory mutation Correlator, called the DisICC pipeline. Correlations between
location and conservation of predicted regions illustrate a strong division between
families while highlighting conservation within individual families. These results suggest
that L Domains are conserved across the Order with strong intra-sequence pressures for
conservation, while the hinge regions lack these pressures. Conserved disorder is reported
for: the amino-terminus of L for L-L complex formation across all families; Domain V for
capping activity across Paramyxovirinae and Vesiculovirus; and Domain VI for cap
methylation across Paramyxovirinae, Rubulaviruses, Avulaviruses,
Ferlavirus and Morbilliviruses. The P sequences show a strong conservation of disorder
within viral families that corresponds to their binding Domains.
Introduction
The viruses of the Order Mononegavirales infect numerous species and have few
treatments. The Centers for Disease Control and Prevention have included the Ebola and
Marburg viruses, both negative-strand RNA viruses belonging to the Order
Mononegavirales, in their list of Bioterrorism Agents/Diseases; however, structural
knowledge of these agents is limited. In contrast to the concern of outbreaks of some of
these viruses, Vesicular Stomatitis virus (VSV) and Rabies virus are being used in
therapies for cancer and experimental vaccines against Human Immunodeficiency Virus
and influenza [1-3].
The viral Order, Mononegavirales, is composed of four families. Bornaviridae
contains the Borna Disease Virus (BDV), which affects the nervous system and the brain
in many animals, including cows and rats; endogenous borna-like nucleoprotein
element sequences also exist within the human genome [4]. Paramyxoviridae includes
Sendai Virus (SENV), which typically affects rats and mice, and two viruses that cause
childhood epidemics, Measles Virus (MeV) and Mumps Virus (MuV). Filoviridae has
only two members, Ebolavirus and Marburgvirus, which cause hemorrhagic fevers with
mortality rates up to 90% in humans [5,6]. The Rhabdoviridae contains Rabies Virus
(RABV) and VSV, which are both able to pass from their animal hosts to cause disease in
humans, as do many Mononegavirales. VSV is the model for the Rhabdoviridae family,
and the prototype for most of the investigation of transcription and replication for the
entire Order of Mononegavirales [7].
Negative-strand RNA viruses are unique in that their RNA genomes are always
encapsidated by a virally encoded nucleoprotein to form a ribonucleoprotein (RNP) complex.
This complex serves as the template for viral RNA synthesis and forms the structural core
of the viruses when packaged into virions [8]. The RNP is formed concurrently with
transcription/replication by the viral RNA-dependent RNA polymerase (RdRp). For all of
Mononegavirales, the RdRp complex is composed of the negative-sense RNA genome
and three proteins: nucleoprotein (N), phosphoprotein (P) and the large subunit
polymerase protein (L). The RNA genome of this complex is always found associated
with the nucleoprotein as the RNP. This structure is resistant to nucleases, even during
synthesis [7,9]. The nucleoprotein is not only important for the encapsidation of the RNA
for transcription, but has also been shown to interact with itself, the large subunit
polymerase protein and the phosphoprotein for the generation of mRNAs for protein
expression [10].
The majority of our knowledge of the activities and functions of the L polymerase
protein of Mononegavirales comes from studying VSV, the prototypic virus for the
Order. L has six conserved regions that are shared among all L proteins in
Mononegavirales as determined by multiple sequence alignment analysis [11]. Domain I
in SENV interacts with P and the P-N° complex, which is P bound to nascent N that is
un-polymerized or not bound to RNA, when nascent RNA is encapsidated during
replication. Conserved charged motifs that play a role in template binding have been
identified in Domain II of SENV [12,13]. The RdRp activity is contained in Domain III,
which has clearly identifiable motifs found in all polymerases and is also responsible for
polyadenylation [14,15]. Domain IV is poorly characterized but has been shown to affect
replication and transcription [16]. The mRNA capping activity is located in Domain V
[17] and it has also been shown to act as a polyribonucleotidyltransferase [18]. The
produced mRNA cap is methylated by a dual specificity methyltransferase in Domain VI
[19]. These Domains influence one another functionally, as a failure to cap the nascent
RNA chain results in the premature termination of transcription, and blocking
methylation results in hyperpolyadenylation. This demonstrates that the 5′ mRNA
processing activities of L intimately regulate its nucleotide polymerization activity and
suggests that the 3D arrangement of the functional Domains likely serves a key regulatory
role during RNA synthesis [20].
Currently, there are no crystal structures or NMR data available for the entire L or
any region of L. However, negative stain electron microscopy has provided a
molecular view of L, both alone and in complex with P [20]. This analysis, combined
with proteolytic digestion and deletion mapping, provides evidence for the organization of
L into a ring Domain containing the RNA polymerase (residues 1-1114, Domains I-IV)
and an appendage of three globular Domains. The enzyme for capping was mapped to
one of the globular Domains (Domain V), which is juxtaposed to the ring, and the cap
methyl-transferase (Domain VI) maps to a more distant and flexible globule. When
bound to P, the L protein was shown to undergo a rearrangement that brings the
functional Domains into the optimal positions required for transcription [20].
The P protein is an essential cofactor for the RdRp activity of L as both proteins are
required to recognize the N-RNA template [21]. In Filoviridae the counterpart to the
phosphoprotein is VP35 [22], which will be referred to as P in this study. The P proteins of
both the Rhabdoviridae and Paramyxoviridae families are oligomers [23,24]. In
Rhabdoviridae P forms dimers, whereas the P of the Paramyxoviridae forms tetramers
[25]. Studies of RABV show that P contains a dimerization Domain located between
residues 91 and 131 [26,27]. Both structural and biochemical analyses suggest that
RABV P forms elongated dimers in solution, which supports the importance of
dimerization in replication [26]. In addition, P binds two distinct forms of N: a non-RNA
associated N and the N-RNA complex. There is evidence that N° proteins, which are
newly formed N proteins that are unassociated with other N or RNA, interact with two
distinct regions of P, located at the amino-terminus in RABV P at residues 4 to 40 [28,29]
and at the carboxyl-terminus in RABV P at residues 185 to 297 [28,30]. In contrast, only the
carboxyl-terminal Domain of P interacts with N when it is bound to genomic RNA in the
nucleocapsid [27]. Experimental evidence obtained for VSV further suggests that the
differential phosphorylation of the amino and carboxyl-termini of P may be involved in
regulating viral transcription and replication [31]. Binding of P to the N-RNA complex
probably involves the carboxyl-terminal region of N (RABV N residues 376 to 450) [32].
A recent model proposes that during replication the L protein forms a complex with P,
which in turn binds to the N-RNA polymer and acts as a bridge to allow access of L to
the RNA. The model further suggests that the P-N° complex may bind to the replicating
L-P-N-RNA complex and feed the newly formed RNA strand with uncomplexed N for
immediate encapsidation [33].
Although there are few structural data on the
replication/transcription complex, recent structures of the VSV and RABV N-RNAs have
been determined [34,35], as well as individual Domains of the VSV and RABV P
proteins, specifically the dimerization Domain of VSV P and the N-RNA binding
Domains of VSV and RABV P [36-38]. Further, the structure of VSV N-RNA
complexed with the carboxyl-terminal Domain of VSV P has been determined, which
reveals that P binds in the cleft between two adjacent N molecules in the nucleocapsid
[39]. The structure of the RABV P protein N-RNA binding Domain combined with
previous work on P-N complexes generated an initial model for the interaction between
RABV N-RNA and P [33,40]. In BDV, P has been shown to interact with the X protein
to regulate polymerase activity [41]. The X protein is a nonstructural protein 87 amino
acids in length [42,43] and its expression has been shown to be tightly regulated by
translational and transcriptional mechanisms [44,45]. The X protein is an important
regulator for viral RNA synthesis and polymerase complex assembly [46] and
recombinant viruses encoding an inactivated X gene or an X protein without a functional
P-binding domain were shown to be not viable [41].
In the studies presented here, the DisICC pipeline [47] was used to produce
models of disorder and intra-protein contacts for the large subunit polymerase, L,
and the phosphoprotein, P. DisICC uses protein sequence information to
correlate the results of protein disorder prediction, correlated mutation,
sequence conservation, and intra-residue contact prediction methods to characterize the L protein.
The purpose of evaluating the regions of disorder within a protein is to identify areas
that have been observed to act as binding sites for protein-ligand interactions. Upon association
with the partner ligand, such a region assumes a secondary structure, as observed by x-ray
crystallography [23, 24]. The flexibility that disorder imparts allows these proteins to
have multiple binding partners as well as multiple functions depending on conformation.
Since the large subunit polymerase protein, L, interacts with the phosphoprotein, P, and the
nucleoprotein, N, it is likely that these regions of interaction will be indicated by disorder
prediction methods. The application of correlated mutation and intra-protein contact
predictors assumes that evolutionary functional constraints limit the amino
acid substitution rates, resulting in a higher conservation of structural/functional sites
with respect to the rest of the protein. Once a residue is changed, given the constraints
operating on it, this mutation can be compensated with an additional mutation of a
corresponding residue elsewhere in the protein that may be in close proximity when
folded to maintain the interaction. This enables the co-evolution of the two residues that
can lead to both high specificity and affinity. These assumptions can be expanded to
include inter-protein residue pairs as well as protein–nucleic acid interactions [25-27].
The knowledge of these important residues aids in modeling protein structures when
combined with additional information derived from the disorder prediction and sequence
conservation. The resulting predictions provide sites that can be pursued for single point
mutation and inhibition analysis in the laboratory within the L and P proteins to interfere
with viral transcription/replication.
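As a rough illustration of the compensatory-mutation idea described above, the following Python sketch scores how strongly two alignment columns co-vary using mutual information. This is a generic stand-in chosen for brevity; it is not the algorithm used by CAPS, CORNET or XDET, and the function names and toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def msa_column(aligned_sequences, index):
    """Return the residues found at one 0-based column of an alignment."""
    return [seq[index] for seq in aligned_sequences]


def mutual_information(column_i, column_j):
    """Mutual information (in bits) between two alignment columns.

    High values mean that the residue at one position tends to change
    together with the residue at the other, as expected for a pair of
    sites linked by compensatory mutations. Gaps are treated as an
    ordinary symbol here, which is a simplifying assumption.
    """
    if len(column_i) != len(column_j):
        raise ValueError("columns must come from the same alignment")
    n = len(column_i)
    count_i = Counter(column_i)
    count_j = Counter(column_j)
    count_ij = Counter(zip(column_i, column_j))
    mi = 0.0
    for (a, b), joint in count_ij.items():
        p_ab = joint / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment: columns 0 and 1 change together, column 2 is invariant.
toy_msa = ["ALK", "ALK", "VFK", "VFK"]
covarying_score = mutual_information(msa_column(toy_msa, 0), msa_column(toy_msa, 1))
invariant_score = mutual_information(msa_column(toy_msa, 0), msa_column(toy_msa, 2))
```

A fully conserved column scores zero against any other column, which is one reason covariation scores are usually read alongside a separate conservation measure, as the pipeline does.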
Results
Disorder Prediction
To identify potential residues that could be involved in inter-protein binding,
protein disorder prediction programs were applied to the L-polymerase and P sequences
and combined into a consensus prediction. The results of the four disorder prediction
programs (PONDR Fit [32-34], IUPred [35, 36], DisEMBL [37], and RONN [48]) were
normalized and averaged for each amino acid residue of the L and P sequences into a
consensus prediction value. Those values were mapped onto the Multiple Sequence
Alignments (MSAs) of each of the four viral families’ L polymerase and P proteins to
observe if there is any pattern in the location of disordered regions (Fig 3.1, Fig 3.2, Fig
3.3).
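The consensus step described above can be sketched in Python as follows. The text does not state how the predictor outputs were normalized or what cutoff defines a disordered residue, so the min-max scaling and the 0.5 cutoff below are assumptions, and all function names are hypothetical rather than part of DisICC.

```python
from statistics import mean


def min_max_normalize(scores):
    """Rescale one predictor's per-residue scores to the range 0..1 (assumed scheme)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def consensus_disorder(per_predictor_scores):
    """Average the normalized scores of several predictors, residue by residue."""
    normalized = [min_max_normalize(scores) for scores in per_predictor_scores]
    return [mean(values) for values in zip(*normalized)]


def map_to_alignment(aligned_sequence, residue_values, gap="-"):
    """Place per-residue values onto alignment columns; gap columns get None."""
    values = iter(residue_values)
    return [None if ch == gap else next(values) for ch in aligned_sequence]


def column_conservation(mapped_rows, cutoff=0.5):
    """Fraction of sequences predicted disordered (value >= cutoff) in each column.

    Gaps count as not disordered; the cutoff is an illustrative assumption.
    """
    n_sequences = len(mapped_rows)
    return [
        sum(1 for v in column if v is not None and v >= cutoff) / n_sequences
        for column in zip(*mapped_rows)
    ]
```

Each sequence contributes one row of mapped values, and `column_conservation` then gives the per-column percentages that are compared against the 30% and 50% levels reported in the following sections.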
L Disorder Predictions
The Paramyxoviridae display a pattern of disorder conserved amongst the subfamilies confined to three regions (Fig 3.1A, Fig S3.1A, Table S3.1A). The regions that
display disorder conservation for the Paramyxoviridae are at MSA positions 26-41, 736-856, and 1447-1473, and most of these amino acids are conserved at greater than 30% for
disorder (Fig 3.1A, Fig S3.1A, Table S3.1A). The first disordered region (26-41) falls in
the amino-terminus of the L protein before any of the conserved Domains. The largest
conserved region of disorder (736-856) lies between Domain II and Domain III in the
MSA. The second largest region (1447-1473) is located in Domain V in the MSA.
Rhabdoviridae show low family conservation of disorder. Only a small region greater
than 50% conservation is found at the start of the amino-terminus at positions 10-24 in
the MSA (Fig 3.1B, Fig S3.1B, Table S3.1B). A few short disordered regions stand out
with greater than 30% conservation at positions 615-622 and 2190-2203 in the MSA. Region
615-622 falls right before Domain II, while region 2190-2203 falls at the end of Domain
VI. The analysis of Filoviridae sequences reveals 5 disordered regions longer than 7
consecutive residues, conserved throughout the L protein at positions 8-15, 692-709, 765-775, 1466-1481, and 1758-1915 in the MSA (Fig 3.1C, Fig S3.1C, Table S3.1C). The first
region (8-15) falls at the amino-terminus of the MSA.
Figure 3.1 Disorder and CICP mapped residues of Family MSAs for L. A.)
Paramyxoviridae B.) Rhabdoviridae C.) Filoviridae D.) Bornaviridae Each family was
aligned according to the process outlined in the methods section. Each residue is
represented by a colored column tick corresponding to Disorder, CICP, both Disorder
and CICP, or neither. Disordered residues are colored on a
scale from yellow, lowest confidence of disorder, to red, highest confidence of residue
disorder. CICPs are in blue. Residues predicted to be both Disordered and a CICP are in
green. Residues that have neither a Disorder nor a CICP prediction are in grey. Gaps in the
alignment are represented in white. The black ticks at the bottom of the alignment denote
every 25 column positions. The boxes above the alignments correspond to the conserved
Domains as described in the text: I (green), II (blue), III (orange), IV (red), V (yellow)
and VI (purple). The color-coded brackets to the left of the alignment indicate virus
families: Paramyxoviridae blue, Rhabdoviridae red, Filoviridae orange, and
Bornaviridae green. The families are further broken into genera. Unclassified viruses are
denoted by stars color-coded for the family considered to be the closest phylogenetic
relative.
Figure 3.2. Disorder and CICP mapped residues for the entire Order MSA of L. See
Figure 3.1 for description of the annotations.
The regions at 692-709 and 765-775 fall within Domain III. The fourth disordered region
(1466-1481) falls in the first half of the hinge region between Domain V and VI. The
largest concentration of conserved disorder is contained in one contiguous region,
1758-1915 (1758-1796, 1817-1826, 1829-1835, 1837-1846, 1856-1883, and 1905-1915),
that falls between Domains V and VI in the latter half of the hinge region (Fig 3.1C, Fig
S3.1C, Table S3.1C). Bornaviridae display discrete, short regions of disorder at residues
1-2, 4-5, 755-760, 1102-1108 and 1445-1448 (Fig 3.1D, Table S3.1D). The first
disordered region (3-7) appears in the amino-terminus; the second (757-762) in Domain
III; the third (1102-1108) right after Domain V; and the fourth (1445-1448) after Domain
VI. The conservation of disorder across the entire Order is weak but there are three short
regions that reach greater than 30% conservation and one region in the amino-terminus
that is greater than 50% conservation (Fig 3.2, Fig S1E). The region with greater than
50% disorder conservation is at MSA position 29-31. The short regions of disorder, 5 or more
residues in length, that are greater than 30% conserved are at MSA positions
842-846, 1649-1657 and 1659-1663. Positions 842-845 fall in the region between
Domains II and III. The disorder at positions 1649-1656 and 1659-1663 falls in
Domain V and the latter contains part of the conserved capping motif GxxT[n]HR [49].
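Regions such as those listed above can be pulled out of a per-column conservation profile with a simple run-length scan. The Python sketch below is illustrative only: the function name is hypothetical, conservation is assumed to be expressed as a fraction between 0 and 1, and the defaults mirror the "greater than 30%" and "5 or more residues" criteria used for the Order alignment.

```python
def conserved_regions(conservation, threshold=0.30, min_length=5):
    """Return 1-based (start, end) MSA positions of contiguous runs whose
    disorder conservation is strictly greater than `threshold` and that
    span at least `min_length` columns."""
    regions = []
    start = None
    for position, value in enumerate(conservation, start=1):
        if value > threshold:
            if start is None:
                start = position  # a new run begins here
        elif start is not None:
            # the run ended at the previous column
            if position - start >= min_length:
                regions.append((start, position - 1))
            start = None
    # close a run that extends to the last column
    if start is not None and len(conservation) - start + 1 >= min_length:
        regions.append((start, len(conservation)))
    return regions


# Toy profile: a 5-column run, a 2-column run (too short), and a 6-column run.
profile = [0.1] * 3 + [0.4] * 5 + [0.1] + [0.6] * 2 + [0.1] + [0.35] * 6
regions = conserved_regions(profile)
```

Raising `threshold` to 0.50 gives the stricter set of regions reported as greater than 50% conserved.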
P Disorder Predictions
Paramyxovirinae P sequences were modified by removing the dispensable region
for transcription and replication, bringing them to a length compatible with the rest of the
Order for alignment [50]. Paramyxoviridae contain a high conservation of disorder
(greater than 50%) in the amino-terminus at positions 11-27, 72-98 and 104-121; and one
region in the carboxyl-terminus at 309-329 in the family MSA (Fig 3.3A, Fig S3.2A,
Table S3.2A). All of the high conservation regions of disorder in the amino-terminus fall
within the oligomerization Domain.
The carboxyl-terminal region of high disorder
conservation (309-329) falls between the L binding Domain and the N-RNA binding
Domain of Sendai. Additional disordered regions that are conserved in greater than 30%
of the MSA are at positions 129-138, 237-242, 279-293 and 351-359. These lower
conserved regions fall in the oligomerization Domain, L binding Domain, and the region
between the L binding Domain and the N-RNA binding Domain, respectively. The
Rhabdoviridae show strong family conservation of disorder in the amino-terminal region
of the P proteins (Fig 3.3B, Fig S3.2B, Table S3.2B). The strongest
region of conservation (greater than 50%) is at MSA positions 75-109. Additionally, a few
disordered regions stand out with 30-40% disorder conservation at
positions 18-35, 37-58, 176-184, 231-245, and 247-252. The regions at positions 18-35
and 37-55 map to the N0 binding region experimentally determined in RABV [29] and
VSV [51,52]. The second highly conserved region (75-109) falls between the N0 binding
Domain and the L binding Domain of VSV [53]. The 30-40% conserved regions fall into
the oligomerization Domain (176-184) of VSV [35] and the N-RNA binding Domain
(231-245, 247-352) of both RABV [30] and VSV [24]. Filoviridae contain
approximately seven regions of conserved disorder (Fig 3.3C, Fig S3.2C, Table S2C).
The first conserved regions of disorder appear in the amino-terminus at positions 15-29, 55-67, 71-87, 178-194 and 204-208 in the MSA. These regions all fall in the oligomerization
Domain in the MSA. The other regions of disorder that are conserved at greater than
50% appear at positions 324-338 and 352-358. These fall in the interferon inhibitory
Domain of Reston Ebola virus (REBOV) in the MSA. Slightly over one half (53.4%) of
the Bornaviridae P protein shows disorder (Fig 3.3D, Fig S3.2D, Table S3.2D). There
are two regions of disorder, at residues 1-75 and 172-202. The first highly conserved
disorder region (1-75) maps to the first half of the X binding region (33-115). The last
disordered region falls in the N binding region. The results of disorder prediction for the
P protein for the Order indicate locations of highly conserved disorder in the MSA (Fig
3.3E, Fig S3.2E). The longest region of highest conservation appears at position 76-103
in the MSA at greater than 50% conservation. This region maps to the oligomerization
domain for Paramyxoviridae and Filoviridae, the N0 binding Domain of Rhabdoviridae
and the X binding Domain in Bornaviridae.
Figure 3.3. Disorder and CICP mapped residues of Family MSAs for P. A.)
Paramyxoviridae B.) Rhabdoviridae C.) Filoviridae D.) Bornaviridae E.) Order. The boxes above the alignments
correspond to the different binding Domains: oligomerization (green), N0 binding (blue),
N-RNA binding (red), L binding (yellow), X binding, which is unique to Bornaviridae
(orange), and interferon inhibitory domain, which is unique to Filoviridae (purple). All
other designations are as in Figure 3.1.
Co-evolution and Intra-Residue Contact
To extract information about the structural and functionally important residues
that are constrained by intra-protein evolutionary pressures, the results of four prediction
programs were combined into a consensus prediction. The results of intra-residue contact
predictors CORNET [54,55] and structure/functional/conserved residue predictions from
ConSEQ [39] were combined with the coevolving residue mutation predictor CAPS [43]
and structural/functional residue predictor XDET [38, 42] and the result is referred to as
the Co-evolution/Intra-residue contact prediction (CICP) consensus. The criteria for
CICP analysis require pair-wise identities of 19-90% and a MSA minimum of 10
sequences to produce statistically significant results. In addition, due to the L protein’s
large size (greater than 2000 amino acids), CORNET was unable to generate predictions.
L protein CICPs were observed for 25 members of the Paramyxoviridae subfamily
Paramyxovirinae and 12 members of Rhabdoviridae (Fig 1, Fig S1), while Bornaviridae and
Filoviridae could not be analyzed.
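The selection criteria above (pair-wise identities of 19-90% and at least 10 sequences) can be checked with a short script. The Python sketch below makes two assumptions that the text does not spell out: identity is computed over columns where neither sequence has a gap, and every pair in the set must fall inside the range. The function names are hypothetical.

```python
from itertools import combinations


def pairwise_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two aligned sequences, counted over
    columns in which neither sequence has a gap (assumed definition)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not compared:
        return 0.0
    matches = sum(1 for a, b in compared if a == b)
    return 100.0 * matches / len(compared)


def meets_cicp_criteria(aligned_sequences, low=19.0, high=90.0, min_sequences=10):
    """True when the alignment has enough sequences and every pair-wise
    identity lies within [low, high] percent."""
    if len(aligned_sequences) < min_sequences:
        return False
    return all(
        low <= pairwise_identity(a, b) <= high
        for a, b in combinations(aligned_sequences, 2)
    )
```

Under these assumptions, a family with fewer than 10 sequences fails the check regardless of identity, which is consistent with Bornaviridae and Filoviridae being excluded from the CICP analysis.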
L CICP Results
The 25 Paramyxoviridae sequences that met the analysis criteria display CICPs
throughout the length of the sequence and account for 1066 (40.8%) of the positions in
the MSA with 659 (25%) positions having greater than 50% conservation of CICPs. The
amino and carboxyl-terminal regions of the proteins contain the lowest concentration of
CICPs while the remainder of the protein shows high concentrations except in two large
areas that are absent of CICPs at MSA positions 739-861 and 1957-2096 (Fig 3.1A, Fig
S3.3A). The first region absent of CICPs (739-861) appears in the hinge region between
Domains II and III. The second CICP empty region (1957-2096) falls between Domains
V and VI. Areas displaying lower frequencies of CICPs are observed to have lower
levels of consecutive hydrophobic residues and lower MSA conservation scores.
Rhabdoviridae have 12 sequences meeting the analysis criteria that could be used to
estimate CICPs in L (RABV, MOKV, ABLV, SCRV, SVCV, CHPV, ISFV, VSNJV,
VSIV, VSSJV, BEFV, FLAV). The CICPs appear throughout the alignment accounting
for 759 (28%) positions in the MSA with 383 (14%) of those having more than 50%
conservation (Fig 3.1B, Fig S3.3B). The lower concentration of CICPs is apparent in the
amino and carboxyl-termini of the sequences and five regions that are absent of CICPs
are observed at MSA positions: 524-552, 1023-1084, 1622-1652, 1940-2105, 2109-2158.
The first region absent of CICPs (524-552) is between Domains I and II. The second
region (1023-1084) is between Domains III and IV. The third region (1622-1652) falls
between Domains V and VI, while the fourth (1940-2105) lies at the end of the hinge
between Domains V and VI and includes the first 14 residues of Domain VI. The final
region at 2109-2158 falls in Domain VI. Areas displaying lower frequencies of CICPs
were also confirmed to have lower levels of consecutive hydrophobic residues and lower
MSA conservation scores.
Within the Order, the CICPs are concentrated away from the N and C termini and
span 1396 (46%) positions, with
conservation over 50% covering 580 (19%) positions (Fig 3.1E, Fig S3.3C). Five regions
are absent of CICPs: 581-626, 844-964, 1262-1313, 1907-1961, 2248-2442 and 2359-2486. The first region (581-626) maps between Domains I and II. The second region
(844-964) falls between Domains II and III. The third region (1262-1313) maps between
Domains III and IV of VSV. Two regions (1907-1961 and 2248-2442) map to the region
between Domains V and VI. The last region (2359-2486) falls after Domain VI.
P CICP Results
Only 16 sequences of the Paramyxoviridae meet the criteria of divergence and
sample size for CICP analysis. The CICPs are primarily in the last half of the MSA, with
a total of 127 (31%) positions and 33 (8%) positions at greater than 50% conservation (Fig 3.2A, Fig
S3.4). The largest conserved region of CICPs is at positions 251-258 with greater than
40% conservation and lies in the L binding Domain. The L binding Domain displays the
largest concentration of highly conserved CICPs. The other areas of high concentration
are: the region of the oligomerization Domain (166-203) adjacent to the L binding
Domain; the L binding Domain (208-259); the region between the L binding Domain and
the N-RNA binding Domain (300-324); and the N-RNA binding Domain (326-346 and
370-389).
Areas displaying lower frequencies of CICPs also have lower levels of
consecutive hydrophobic residues and lower MSA conservation scores.
Discussion
Disorder Predictions for L and P
Disordered regions, or intrinsically unstructured proteins (IUPs), are protein regions that
exist without a defined secondary structure. Such regions of disorder within proteins are
observed to be binding sites that assume a secondary structure, observable
by x-ray crystallography, when in association with the partner ligand [56,57]. When
not associated with a binding ligand, these disordered regions often appear as regions of
missing electron density in the crystal structure because they do not adopt a static
secondary structure. Disordered regions allow proteins to have many binding partners and
different functions depending on their conformations. The results of the disorder prediction
for the families and the Order illustrate conservation for disorder among both the L
polymerase and P proteins.
Examination of the L protein results for the Order reveals two regions that are
conserved for disorder across all families and a region that is conserved amongst
Paramyxoviridae and Rhabdoviridae. The first region of disorder that spans all four
families is located at the amino-terminus (Fig 3.1). In Paramyxoviridae this region
appears in the first 20 amino acids for all viruses except HPV3 (Fig 3.1A). A study of
SENV revealed that amino acids 2-19 are required to form the L-L complex and their
deletion abolishes biological activity [58,59]. In Rhabdoviridae this region of disorder is
larger, close to twice the size, and it is especially pronounced in the Lyssaviruses and the
Nucleorhabdoviruses. The presence of this disorder conservation across all families
suggests that the oligomerization region is present in this location for all viruses of the
Order in this study. The second region of conserved disorder in all but Bornaviridae is in
Domain V and corresponds to the capping activity. This disorder region contains the
GxxT portion of the conserved capping motif GxxT[n]HR, which is responsible for the
unconventional capping mechanism that is conserved across the Order [49]. Additionally,
the CICPs show an overlap within this region in Domain V indicating the functional
conservation of this region in addition to the disorder (Fig 3.2). These results provide
evidence that the disorder may be related to the capping activity.
Amongst Paramyxoviridae and Rhabdoviridae, Domain VI contains a significant conserved region
of disorder (Fig 3.2, Fig 3.1A, Fig 3.1B). This region aligns with the conserved motifs II
and III of the methyltransferase that was shown in VSV, Bovine Ephemeral Fever virus
(BEFV), REBOV, RABV, Human Respiratory Syncytial virus (HRSV), MeV and SENV
to be functionally related to the RrmJ heat shock [ribose-2′-O]-methyltransferase
of Escherichia coli and to the conserved motifs of the S-adenosylmethionine (SAM)-dependent
methyltransferase superfamily [60,61]. Motif II has the D-loop, which contains an acidic
residue, Asp or Glu, whose side chain hydrogen bonds with the ribose hydroxyl of SAM
[62]. Whenever the substrate (SAM), its analogs, or reaction product (SAH) are co-crystallized, they are found close to the invariant residues in motifs I-III [62]. The
protein disorder in motifs II-III is conserved in the Rubulaviruses, Avulaviruses,
Ferlavirus and Morbilliviruses in Paramyxovirinae as well as the Vesiculoviruses in
Rhabdoviridae. This conservation suggests a disorder-order transition upon binding
SAM or its analogs that may assist in mRNA capping. Disorder unique to
Paramyxovirinae falls between Domains II and III, suggesting this location as a possible
interaction region specific to this subfamily, as it is completely absent from
Pneumovirinae (Fig 3.1A, Fig S3.1A). In Filoviridae most of the observed disorder in
the ebolaviruses is not shared in Marburgvirus. The other region of family shared disorder
is in the hinge region between Domains V and VI (Fig 3.1C, Fig S3.1C). Bornaviridae
contains only a small amount of disorder with only one region that agrees with the rest of
the Order, the L oligomerization region (Fig 3.1D, Fig S3.1D).
In contrast to the L protein, the P protein has been well characterized for MeV,
RABV, SENV and VSV [27,63-65]. The consensus disorder results for the families from
this study for P agree with the evidence found in the previous works [27,63-65] and
expand the inferences to the other members within each family in this study. Unlike the
L polymerase and N nucleoprotein, which share similar organization and
conservation between families, P has very divergent sequences and has evolved different
domain organizations between families; therefore, cross-family inference is illogical.
Paramyxoviridae has a conserved disordered region in the amino-terminus (Fig
3.1A, Fig S3.1A). This disordered region is located in the oligomerization domain of
both SENV [23] and Rinderpest virus [66]. The carboxyl-terminal regions of
Paramyxoviridae that display high disorder conservation were shown to be at the end of
the L binding Domain, and between the L binding Domain and the N-RNA binding
Domain, which agrees with 2009 results for Sendai virus [67] that identified these regions as
disordered and ambiguously disordered. The conservation of these ambiguous regions
indicates that they are important to the function of P and should be studied further.
Rhabdoviridae show strong family conservation in the amino-terminal region for
conserved disorder with the highest level of conservation at over 80% between the N0 and
L binding Domains (Fig 3.1B, Fig S3.1B). The first conserved region falls in the N0
binding region and it is conserved in the MSA amongst all but Novirhabdoviruses (Fig
3.1B, Fig S3.1B). The binding of N0 is important for the function of the RdRp complex
preventing N from polymerizing and binding to non-viral RNA [29,52,68]. These results
provide evidence for the conservation of disorder at this location in P for the N0 binding
Domain. Filoviridae contains approximately seven regions of conserved disorder (Fig
3.1C, Fig S3.2C). Five of these regions appear in the oligomerization Domain with the
other two in the interferon inhibitory region. The results are again in line with the
function of P for binding with the partner ligands. The separate oligomerization Domains
across Paramyxoviridae, Rhabdoviridae and Filoviridae display varying degrees of
disorder but all these members contain some disorder indicating a selection for disorder
in the process of oligomerization (Fig 3.2E). Bornaviridae showed a large percentage of
disorder throughout the sequence, 53.4% (Fig 3.2D, S3.2D). There are two regions of
disorder (Fig 3.2D, Fig S3.2D) and they fall in the X binding region and the N binding
region [69].
Co-evolution and Intra-Residue Contact for L and P
The functional constraints that evolution applies to a protein modify amino acid
substitution rates at structural/functional sites, resulting in a higher conservation of these
sites with respect to the rest of the protein. Mutation of a residue in the protein at these
sites can be compensated by an additional mutation of a corresponding residue locally or
further up or downstream, illustrating intra-protein residue co-evolution. This property
can be exploited by various prediction methods to identify these structural/functional
regions, and the DisICC pipeline combines methods to produce a conservative prediction
of these residues and regions. DisICC specifically uses the results of three intra-residue
contact and functional/structural/conservation predictors, CORNET, ConSEQ and XDET,
and the coevolving residue mutation predictor, CAPS, combined into a consensus of
structural/functional predictions. ConSEQ makes predictions by estimating the rate of
amino acid evolution at each position in a MSA of homologous proteins [39]. The
underlying assumption of this approach is that, in general, structurally and functionally
important residues are slowly evolving. CORNET is a neural network-based method
using correlated mutations, sequence conservation, predicted secondary structure, and
evolutionary information [40, 41].
CAPS compares the correlated variance of the
evolutionary rates at two sites corrected by the time since the divergence of the protein
sequences [43]. XDET compares the mutational behavior of a residue position with the
mutational behaviors of the entire alignment, which assumes the positions showing a
family-dependent conservation pattern will have similar mutational behaviors as the rest
of the family [38, 42]. All these methods are combined into the CICP, which correlates
the structure and functional predictions with the residues that are constrained by intra-protein evolutionary pressures. The concentration of CICPs correlates with the
evolutionary distances between the sequences used – the closer the evolutionary distances
within a region the higher the concentration of CICPs for that region given that it also
contains structural or functionally important residues.
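The text does not state the exact rule by which the individual predictor outputs are merged into the CICP consensus. One plausible reading of a "conservative" combination is a vote across predictors, sketched below in Python. The voting rule, the `min_votes` default, the function name and the example positions are all assumptions made for illustration and should not be taken as the DisICC implementation.

```python
from collections import Counter


def cicp_consensus(predictions, min_votes=2):
    """Return the sorted MSA positions flagged by at least `min_votes` predictors.

    `predictions` maps a predictor name to the collection of positions it
    flagged as structurally/functionally important or co-evolving.
    """
    votes = Counter()
    for positions in predictions.values():
        votes.update(set(positions))  # each predictor votes once per position
    return sorted(pos for pos, count in votes.items() if count >= min_votes)


# Made-up positions, used only to show the call.
example = {
    "ConSEQ": {10, 11, 12},
    "CAPS": {11, 12, 40},
    "XDET": {12, 40, 41},
}
consensus_positions = cicp_consensus(example)
```

A higher `min_votes` gives a stricter consensus at the cost of coverage; when one predictor cannot run, as with CORNET on L, the vote would simply be taken over the remaining methods.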
The Paramyxoviridae and Rhabdoviridae CICPs for L are spread along the Order
MSA with their concentration away from the amino and carboxyl-termini and one region
that is absent of CICPs (Fig 3.2E, Fig S3.3C). The low level of CICPs in the amino-terminus of the MSAs, combined with the conserved disorder results at this position, the
low level of hydrophobicity and sequence conservation, supports the experimental
evidence that this is a binding region and participates in forming the L-L complex
[58,59]. The CICPs for both families are concentrated in the Domains.
However,
Domains V and VI have lower conservation of CICPs than the others. This lower level
of CICP conservation coincides with the disorder conservation in both Domains. This is
the ligand binding/interaction site, and as stated, coincides with the capping activities of
these Domains. It can be inferred that the rest of the Order has these functions in this
location. The CICP results for L in Paramyxoviridae have one distinct region with a
complete absence of CICP conservation (Fig 3.1A, Fig S3.3A). The region between
Domains II and III is highly conserved for disorder, indicating that this region is selected
for flexibility or another function where disorder is beneficial specifically to
Paramyxovirinae, as it is absent from the rest of the Order (Fig 1E). From the evidence
of this study and the corroborating findings of individual viral L proteins from previous
studies, it can be inferred that Rhabdoviridae and Paramyxoviridae, and more generally
the other viruses in Mononegavirales, have similar functional/structural regions
corresponding specifically to those regions showing conservation in disorder and low
intra-protein co-evolution even though they may have weak amino acid sequence
conservation in L across the Order.
The level of CICPs in P is significantly lower than what was observed in L (Fig
3.2, Fig 3.2B). This is correlated with the level of disorder observed in P and agrees with
previous observations of the multiple partner binding regions that have been identified in
P [29,52,67-69]. Further, the presence of the disorder and absence of the intra-protein
interactions in the binding regions supports what we would expect biologically: low
hydrophobicity, high levels of disorder and low levels of intra-protein co-evolution. This
agrees with results for defining binding regions from the previous study of the
nucleoprotein [47]. Thus, based on the DisICC results within the viral families for P,
the experimentally validated interaction regions (oligomerization, N0 binding, N-RNA
binding, X binding, interferon inhibition) can be inferred for the other members of each
family.
The initial study that used the DisICC pipeline [47] combined analysis of 63
representative nucleoprotein sequences from the four viral families (Bornaviridae,
Filoviridae, Rhabdoviridae, and Paramyxoviridae). We predicted the regions of protein
disorder, intra-residue contact and co-evolving residues, and correlated between location
and conservation of predicted regions. The results reveal a strong division between
families while highlighting conservation within individual families. The results suggest
that conserved regions among the nucleoproteins, within Rhabdoviridae and
Paramyxoviridae, but also generally among all members of the Order, reflect an
evolutionary advantage in maintaining these sites for the viral nucleoprotein as part of the
transcription/replication machinery. Specifically, the results indicated conservation for
disorder in the C-terminus region of the representative proteins that is important for
interacting with the P and L polymerase during transcription and replication.
Additionally, the C-terminus region of the protein preceding the disordered region is
predicted to be important for interacting with the encapsidated genome. We identified
portions of the N-terminus as being responsible for N:N stability and interactions by the
presence or lack of CICPs. These results were validated against experimental
observations from nucleoprotein interactions with P, other N proteins and the RNA
genome. The predictions of disorder and correlated mutations
were additionally corroborated against structural information. The existing crystal structures for the
nucleoprotein complex of RABV (PDB ID 2GTT) and the VSIV N:RNA & P complex
(PDB ID 3HHZ) were used. The amino acid sequence information was extracted from the Protein
Data Bank files and aligned with the corresponding nucleoprotein amino
acid sequence used in the predictions. The aligned positions were then used to map the
appropriate predictions to the crystal structure. To explore predicted features that may
point to protein-protein interaction, MolProbity all-atom-contact analysis [70] was
conducted to identify N-P interacting residues and RNA
interactions and compared to the disorder and CICP results.
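The mapping of predictions onto a crystal structure through aligned positions can be sketched as follows. This is only a minimal illustration, assuming the structure-derived sequence and the prediction sequence have already been pairwise aligned with '-' as the gap character. The function name map_predictions_to_structure and the toy sequences are hypothetical and are not taken from the DisICC code.

```python
def map_predictions_to_structure(aligned_query, aligned_pdb, predictions):
    """Transfer per-residue predictions from a query sequence to a structure sequence.

    aligned_query / aligned_pdb: equal-length gapped strings ('-' = gap).
    predictions: dict of 1-based query position -> prediction value.
    Returns a dict of 1-based structure sequence index -> prediction value.
    """
    if len(aligned_query) != len(aligned_pdb):
        raise ValueError("aligned sequences must have equal length")
    mapped = {}
    query_pos = 0  # residues seen so far in the query
    pdb_pos = 0    # residues seen so far in the structure sequence
    for q_char, p_char in zip(aligned_query, aligned_pdb):
        if q_char != "-":
            query_pos += 1
        if p_char != "-":
            pdb_pos += 1
        # Only columns where both sequences have a residue can carry a prediction.
        if q_char != "-" and p_char != "-" and query_pos in predictions:
            mapped[pdb_pos] = predictions[query_pos]
    return mapped


# Toy example: the structure sequence lacks the first two residues of the query.
query = "MSKIFVN"
pdb = "--KIFVN"
disorder = {1: 1.0, 3: 0.25, 7: 0.75}
structure_map = map_predictions_to_structure(query, pdb, disorder)
```

The returned indices count residues in the extracted structure sequence; translating them to the residue numbering used inside a particular PDB file would be a separate step.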
Further validation of the results presented here by current experimental
observations illustrates that the DisICC pipeline, a combination of evolutionary
dynamics, disorder prediction, intra-protein structure/function predictions and coevolving residue prediction, provides the ability to identify residues and regions important
for protein-ligand interactions, intra-protein interactions and protein-protein interfaces
without knowledge of structure.
The DisICC pipeline uses sequence information to characterize proteins by
predicting the residues and regions necessary to disrupt viruses. For viruses with little
available structural information, it can quickly provide target information to aid
researchers with response and development of treatments. DisICC results can also identify slowly
evolving protein regions of viruses, thereby indicating new targets in the development of
lasting treatments. DisICC can also be applied to other multi-protein systems where
identifying structurally/functionally conserved residues and regions to disrupt is of
interest. In summary, the
DisICC pipeline is a powerful tool for rapid protein disorder and structural/functional
characterization, and it can provide predictions of protein interaction regions that are
easily validated experimentally.
Materials and Methods
Multiple Sequence Alignment
The L and P multiple sequence alignments for each family were created by
submitting sequences to the MAFFT ver. 6 server
(http://mafft.cbrc.jp/alignment/server/index.html) using the E-INS-i strategy. Each
family alignment was manually curated to ensure optimal alignments. For the alignment
of the entire Order, the independent family alignments were combined into one FASTA
file and submitted to the MAFFT ver. 6 alignment server using the E-INS-i strategy [71].
The MSA output was then manually curated due to the wide divergence of the sequences.
The sequences used in this study in alignment order for L are: HMPNV, Human
Metapneumovirus (YP_012613.1);
AVPNV, Avian Pneumovirus (AAT58244.1);
HRSVB1, Human Respiratory Syncytial Virus B1 (NP_056866.1); HRSVA2, Human
Respiratory Syncytial Virus A2 (P28887); HRSVS2, Human Respiratory Syncytial Virus
S2 (AAC57029.1); RSV, Respiratory Syncytial Virus (NP_044598.1); BRSV, Bovine
Respiratory Syncytial Virus (NP_048058.1); PNVM15, Pneumonia Virus of Mice 15
(AAW02843.1); PNVMJ3666, Pneumonia Virus of Mice J3666 (YP_173335.1); MuV,
Mumps Virus (NP_054714.1); TIOV, Tioman Virus (NP_665871.1); MENV, Menangle
Virus (YP_415514.1); SPIV41, Simian Parainfluenza Virus (YP_138510.1); HPIV2,
Human Parainfluenza Virus 2 (NP_598406.1); SPIV5, Simian Parainfluenza Virus 5
(YP_138518.1); AVPMV6, Avian Paramyxovirus 6 (NP_150063.1); GPV, Goose
Paramyxovirus SF02 (NP_872278.1); NCDV, Newcastle Disease Virus (NP_071471.1);
FDLV, Fer-de-lance Virus (NP_899661.1); TUPV, Tupaia Paramyxovirus
(NP_054697.1); NIPH, Nipah Virus (NP_112028.1); HV, Hendra Virus (NP_047113.2);
MOSV, Mossman Virus (NP_958055.1); BEIV, Beilong Virus (YP_512254.1); JV, J
Virus (YP_338085.1); CDV, Canine Distemper Virus (NP_047207.1); PDV, Phocine
Distemper Virus (CAA70843.1); DMV, Dolphin Morbillivirus (NP_945030.1); PDPRV,
Peste-des-petits-ruminants Virus (YP_133828.1); MeV, Measles Virus (NP_056924.1);
RPV, Rinderpest Virus (YP_087126.2); HPV1, Human Parainfluenza Virus 1
(NP_604442.1); SENV, Sendai Virus (NP_056879.1); BPV3, Bovine Parainfluenza Virus
3 (NP_037646.1); HPV3, Human Parainfluenza Virus 3 (NP_067153.1); FLAV,
Flanders Virus (AAN73288.1); BEFV, Bovine Ephemeral Fever Virus (NP_065409.1);
SCRV, Siniperca Chuatsi Rhabdovirus (YP_802942.1); ISFV, Isfahan Virus (Q5K2K3);
CHPV, Chandipura Virus (CAH17543.1); SVCV, Spring Viremia of Carp Virus
(NP_116748.1); VSNJV, Vesicular Stomatitis New Jersey Virus (P16379); VSIV,
Vesicular Stomatitis Indiana Virus (NP_041716.1); VSSJV, Vesicular Stomatitis San
Juan Virus (P03523); ABLV, Australian Bat Lyssavirus (NP_478343.1); RABV, Rabies
Virus (NP_056797.1); MOKV, Mokola Lyssavirus (YP_142354.1); NCMV, Northern
Cereal Mosaic Virus (NP_597914.1); LNYV, Lettuce Necrotic Yellows Virus
(YP_425092.1); SYNV, Sonchus Yellow Net Virus (NP_042286.1); MFSV, Maize Fine
Streak Virus (YP_052849.1); RYSV, Rice Yellow Stunt Virus (NP_620502.1); MMV,
Maize Mosaic Virus (YP_052855.1); TVCV, Taro Vein Chlorosis Virus (YP_224083.1);
SNAKV, Snakehead Virus (NP_050585.1); VHSV, Viral Hemorrhagic Septicemia Virus
(NP_049550.1); HIRV, Hirame Virus (NP_919035.1); IHNV, Infectious Hematopoietic
Necrosis Virus (NP_042681.1); SEBOV, Sudan Ebola Virus (YP_138527.1); ZEBOV,
Zaire Ebola Virus (NP_066251.1); RSV, Respiratory Syncytial Virus (NP_044598.1);
MARV, Lake Victoria Marburgvirus (NP_042031.1); BDV, Borna Virus (NP_042024.1)
The sequences used in the study in alignment order for P are: HMPV, Human
Metapneumovirus (YP_012606.1); AVPN, Avian Pneumovirus (AAT58237.1);
HRSVB1, Human Respiratory Syncytial Virus B1 (O42062); HRSVA2, Human
Respiratory Syncytial Virus A2 (P03421); HRSVS2, Human Respiratory Syncytial Virus
S2 (O09633); RSV, Respiratory Syncytial Virus (NP_044592.1); BRSV, Bovine
Respiratory Syncytial Virus (P33454); PNVM15, Pneumonia Virus of Mice
(AAW02835.1); PNVMJ3666, Pneumonia Virus of Mice (YP_173327.1); MuV, Mumps
Virus (NP_054708.1); TIOV, Tioman Virus (NP_665865.1); MENV, Menangle Virus
(YP_415509.1); SPIV41, Simian Parainfluenza Virus (YP_138505.1); HPIV2, Human
Parainfluenza Virus 2 (NP_599019.1); SPIV5, Simian Parainfluenza Virus
(YP_138512.1); AVPM, Avian Paramyxovirus (NP_150058.1); GPV, Goose
Paramyxovirus (NP_872274.1); NDV, Newcastle Disease Virus (NP_071467.1); FDLV,
Fer-de-lance Virus (NP_899656.1); TUPV, Tupaia Paramyxovirus (NP_054691.1);
NIPH, Nipah Virus (NP_112022.1); HEND, Hendra Virus (NP_047107.2); MOSV,
Mossman Virus (NP_958049.1); BEIV, Beilong Virus (YP_512245.1); JV, J-virus
(YP_338076.1); CDV, Canine Distemper Virus (NP_047202.1); PDV, Phocine
Distemper Virus (CAA53573.1); DMV, Dolphin Morbillivirus (NP_945025.1); PDPR,
Peste-des-petits-ruminants Virus (YP_133822.1); MeV, Measles Virus (NP_056919.1);
RPV, Rinderpest Virus (YP_087121.2); HPIV1, Human Parainfluenza Virus
(NP_604435.1); SENV, Sendai Virus (NP_056873.1); BPIV, Bovine Parainfluenza Virus
(NP_037642.1); HPIV3, Human Parainfluenza Virus (NP_067149.1); FLAV, Flanders
Virus (AAN73284.1); BEFV, Bovine Ephemeral Fever Virus (NP_065399.1); SCRV,
Siniperca Chuatsi Rhabdovirus (YP_802938.1); ISFV, Isfahan Virus (Q5K2K6); CHPV,
Chandipura Virus (P16380); SVCV, Spring Viremia of Carp Virus (NP_116745.1);
VSNJ, Vesicular Stomatitis Virus New Jersey (P04877); VSIV, Vesicular Stomatitis
Indiana Virus (NP_041713.1); VSSJ, Vesicular Stomatitis San Juan Virus (P03520);
ABLV, Australian Bat Lyssavirus (NP_478340.1); RABV, Rabies Virus (NP_056794.1);
MOKV, Mokola Virus (YP_142351.1); NCMV, Northern Cereal Mosaic Virus
(NP_057955.1); LNYV, Lettuce Necrotic Yellows Virus (YP_425088.1); SYNV, Sonchus
Yellow Net Virus (NP_042284.1); MFSV, Maize Fine Streak Virus (YP_052844.1);
RYSV, Rice Yellow Stunt Virus (NP_620497.1); MMV, Maize Mosaic Virus
(YP_052851.1); TVCV, Taro Vein Chlorosis Virus (YP_224079.1); SNAK, Snakehead
Rhabdovirus (NP_050581.1); VHSV, Viral Hemorrhagic Septicemia Virus
(NP_049546.1); IHNV, Infectious Hematopoietic Necrosis Virus (NP_042677.1); HIRV,
Hirame Rhabdovirus (NP_919031.1); SEBO, Sudan Ebolavirus (YP_138521.1); REBO,
Reston Ebolavirus (NP_690581.1); ZEBO, Zaire Ebola Virus (NP_066244.1); MARV,
Lake Victoria Marburgvirus (NP_042026.1); BDV, Borna Disease Virus (NP_042021.1)
Disorder
Disorder calculations were performed using PONDR Fit [72], IUPred [73,74],
RONN [48] and DisEMBL [75] prediction programs. PONDR Fit was run under the
default settings. IUPred was run under the long sequence default settings. DisEMBL was
run using default settings and the Hot-loop and Coil results were both included in my
evaluation. RONN was run under default settings. All the disorder prediction results from
these methods were normalized to a binary value: disordered (1) or non-disordered (0). This assignment
was determined by evaluating whether the disorder value for a residue was above or below
the predictor's disorder threshold (for most methods this threshold is 0.5).
These normalized values were then combined and averaged to a consensus value for each
residue. This calculated value is used as the overall indicator for the prediction of
disorder in the results. It should be noted that this consensus method provides an overall
conservative prediction of disorder revealing residues with high probability of disorder
and preventing over-prediction.
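The consensus calculation described above can be sketched as follows. This is a minimal illustration, not the DisICC implementation: the function names, the toy scores and the uniform 0.5 thresholds are assumptions, and a score exactly at the threshold is treated here as non-disordered.

```python
def binarize(scores, threshold):
    """Convert raw per-residue scores to 1 (disordered) or 0 (non-disordered)."""
    return [1 if s > threshold else 0 for s in scores]


def consensus_disorder(predictions):
    """Average the binary calls of all predictors at each residue.

    predictions: dict of predictor name -> (per-residue scores, threshold).
    Returns a list with one consensus value in [0, 1] per residue.
    """
    calls = [binarize(scores, thr) for scores, thr in predictions.values()]
    lengths = {len(c) for c in calls}
    if len(lengths) != 1:
        raise ValueError("all predictors must cover the same residues")
    n = len(calls)
    return [sum(column) / n for column in zip(*calls)]


# Toy scores for a four-residue protein; values and thresholds are illustrative only.
toy = {
    "PONDR Fit": ([0.9, 0.6, 0.4, 0.1], 0.5),
    "IUPred": ([0.8, 0.4, 0.3, 0.2], 0.5),
    "RONN": ([0.7, 0.55, 0.6, 0.3], 0.5),
    "DisEMBL": ([0.6, 0.2, 0.1, 0.05], 0.5),
}
consensus = consensus_disorder(toy)  # [1.0, 0.5, 0.25, 0.0]
```

A residue reaches a high consensus value only when most predictors agree, which is why the combined score is more conservative than any single method.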
Correlated Mutations and
Intra-Residue Contact Prediction
The correlated mutation prediction programs used in this study were XDET
[76,77] and CAPS [78] and the intra-residue contact prediction programs implemented
were ConSEQ [79] and CORNET [54,55]. The input files for these applications were
generated by calculating the pairwise percent identities within each family for L and for
the Order for P. Amino acid sequence identities between 19% and 90% were used in the
analyses. XDET, CAPS and CORNET were all run under the default parameters, and
ConSEQ used all defaults except that the “amino acid conservation method” was set to
Bayesian. The resulting predictions from each program were combined, and any residue
that showed positive agreement among three or more predictors was classified as a CICP.
Conservation of CICPs within the alignments was calculated per alignment
position by summing the CICP occurrences per column and dividing by the total
number of sequences that participated in the CICP study for that alignment.
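The two calculations in this section, the agreement vote that defines a CICP and the per-column conservation of CICPs, can be sketched as follows. This is a hedged illustration under stated assumptions: each predictor's output is assumed to have been reduced to a set of flagged residue positions, and the function names and toy data are hypothetical.

```python
def classify_cicps(calls_by_predictor, min_votes=3):
    """Return the residue positions flagged by at least min_votes predictors.

    calls_by_predictor: dict of predictor name -> set of flagged positions.
    """
    votes = {}
    for positions in calls_by_predictor.values():
        for pos in positions:
            votes[pos] = votes.get(pos, 0) + 1
    return {pos for pos, count in votes.items() if count >= min_votes}


def cicp_conservation(aligned_seqs, cicps_by_seq):
    """Fraction of sequences carrying a CICP at each alignment column.

    aligned_seqs: dict of sequence name -> gapped sequence ('-' = gap).
    cicps_by_seq: dict of sequence name -> set of 1-based ungapped CICP positions.
    """
    n_cols = len(next(iter(aligned_seqs.values())))
    counts = [0] * n_cols
    for name, aligned in aligned_seqs.items():
        residue_index = 0
        for col, char in enumerate(aligned):
            if char == "-":
                continue  # gaps hold no residue and cannot be a CICP
            residue_index += 1
            if residue_index in cicps_by_seq.get(name, set()):
                counts[col] += 1
    return [c / len(aligned_seqs) for c in counts]


# Toy predictor calls for one sequence: positions 2 and 5 reach three votes.
calls = {"XDET": {2, 5}, "CAPS": {2, 5, 7}, "ConSEQ": {2, 7}, "CORNET": {2, 5}}
cicps = classify_cicps(calls)

# Toy two-sequence alignment with CICPs given in ungapped coordinates.
alignment = {"A": "MK-LV", "B": "MKQLV"}
conservation = cicp_conservation(alignment, {"A": {2, 3}, "B": {2}})
```

Working in ungapped coordinates per sequence and converting to columns only at the end keeps the vote independent of how the alignment was curated.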
Hydrophobic Residues and MSA Conservation
The relationship between CICPs and MSA positions that contained hydrophobic residues
and/or showed high sequence conservation was studied using Jalview [80], which
provides visualization of hydrophobicity and sequence conservation. Conservation
annotation scores were then compared with hydrophobicity for the MSA residues that
displayed CICPs.
References
1.
Koser ML, McGettigan JP, Tan GS, Smith ME, Koprowski H, et al. (2004) Rabies
virus nucleoprotein as a carrier for foreign antigens. Proc Natl Acad Sci USA 101:
9405–9410. doi:10.1073/pnas.0403060101.
2.
Lichty BD, Power AT, Stojdl DF, Bell JC (2004) Vesicular stomatitis virus: reinventing the bullet. Trends Mol Med 10: 210–216.
doi:10.1016/j.molmed.2004.03.003.
3.
Johnson JE, Coleman JW, Kalyan NK, Calderon P, Wright KJ, et al. (2009) In
vivo biodistribution of a highly attenuated recombinant vesicular stomatitis virus
expressing HIV-1 Gag following intramuscular, intranasal, or intravenous
inoculation. Vaccine 27: 2930–2939. doi:10.1016/j.vaccine.2009.03.006.
4.
Horie M, Honda T, Suzuki Y, Kobayashi Y, Daito T, et al. (2010) Endogenous
non-retroviral RNA virus elements in mammalian genomes. Nature 463: 84–87.
doi:10.1038/nature08695.
5.
Becker S, Huppertz S, Klenk HD, Feldmann H (1994) The nucleoprotein of
Marburg virus is phosphorylated. J Gen Virol 75 ( Pt 4): 809–818.
6.
Watanabe S, Noda T, Kawaoka Y (2006) Functional mapping of the nucleoprotein
of Ebola virus. J Virol 80: 3743.
7.
Chuang JL, Perrault J (1997) Initiation of vesicular stomatitis virus mutant polR1
transcription internally at the N gene in vitro. J Virol 71: 1466–1475.
8.
Whelan SPJ, Barr JN, Wertz GW (2004) Transcription and replication of
nonsegmented negative-strand RNA viruses. Curr Top Microbiol Immunol 283:
61–119.
9.
Cevik B, Kaesberg J, Smallwood S, Feller JA, Moyer SA (2004) Mapping the
phosphoprotein binding site on Sendai virus NP protein assembled into
nucleocapsids. Virology 325: 216–224. doi:10.1016/j.virol.2004.05.012.
10.
Murphy LB, Loney C, Murray J, Bhella D, Ashton P, et al. (2003) Investigations
into the amino-terminal domain of the respiratory syncytial virus nucleocapsid
protein reveal elements important for nucleocapsid formation and interaction with
the phosphoprotein. Virology 307: 143–153.
11.
Poch O, Blumberg BM, Bougueleret L, Tordo N (1990) Sequence comparison of
five polymerases (L proteins) of unsegmented negative-strand RNA viruses:
theoretical assignment of functional domains. J Gen Virol 71 ( Pt 5): 1153–1162.
12.
Müller R, Poch O, Delarue M, Bishop DH, Bouloy M (1994) Rift Valley fever
virus L segment: correction of the sequence and possible functional role of newly
identified regions conserved in RNA-dependent polymerases. J Gen Virol 75 ( Pt
6): 1345–1352.
13.
Smallwood S, Easson CD, Feller JA, Horikami SM, Moyer SA (1999) Mutations
in Conserved Domain II of the Large (L) Subunit of the Sendai Virus RNA
Polymerase Abolish RNA Synthesis. Virology 262: 375–383.
doi:10.1006/viro.1999.9933.
14.
Schnell MJ, Conzelmann KK (1995) Polymerase activity of in vitro mutated rabies
virus L protein. Virology 214: 522–530. doi:10.1006/viro.1995.0063.
15.
Das T, Banerjee AK (1993) Acidic domain of the phosphoprotein (P) of vesicular
stomatitis virus differentially interacts with homologous and heterologous
nucleocapsid protein (N). Cell Mol Biol Res 39: 93–100.
16.
Feller JA, Smallwood S, Horikami SM, Moyer SA (2000) Mutations in Conserved
Domains IV and VI of the Large (L) Subunit of the Sendai Virus RNA Polymerase
Give a Spectrum of Defective RNA Synthesis Phenotypes. Virology 269: 426–439.
doi:10.1006/viro.2000.0234.
17.
Li J, Chorba JS, Whelan SPJ (2007) Vesicular Stomatitis Viruses Resistant to the
Methylase Inhibitor Sinefungin Upregulate RNA Synthesis and Reveal Mutations
That Affect mRNA Cap Methylation. J Virol 81(8): 4104-4115
18.
Ogino T, Banerjee AK (2007) Unconventional mechanism of mRNA capping by
the RNA-dependent RNA polymerase of vesicular stomatitis virus. Mol Cell 25:
85–97. doi:10.1016/j.molcel.2006.11.013.
19.
Grdzelishvili VZ, Smallwood S, Tower D, Hall RL, Hunt DM, et al. (2005) A
single amino acid change in the L-polymerase protein of vesicular stomatitis virus
completely abolishes viral mRNA cap methylation. J Virol 79(12) 7327-7337
20.
Rahmeh AA, Schenk AD, Danek EI, Kranzusch PJ, Liang B, et al. (2010)
Molecular architecture of the vesicular stomatitis virus RNA polymerase. Proc
Natl Acad Sci USA 107: 20075–20080. doi:10.1073/pnas.1013559107.
21.
Emerson SU, Wagner RR (1972) Dissociation and reconstitution of the
transcriptase and template activities of vesicular stomatitis B and T virions. J Virol
10: 297–309.
22.
Möller P, Pariente N, Klenk H-D, Becker S (2005) Homo-oligomerization of
Marburgvirus VP35 is essential for its function in replication and transcription. J
Virol 79: 14876–14886. doi:10.1128/JVI.79.23.14876-14886.2005.
23.
Curran J, Boeck R, Lin-Marq N, Lupas A, Kolakofsky D (1995)
Paramyxovirus Phosphoproteins Form Homotrimers as Determined by an Epitope
Dilution Assay, via Predicted Coiled Coils. Virology 214: 139–149.
doi:10.1006/viro.1995.9946.
24.
Gao Y, Lenard J (1995) Cooperative binding of multimeric phosphoprotein (P) of
vesicular stomatitis virus to polymerase (L) and template: pathways of assembly. J
Virol 69: 7718–7723.
25.
Tarbouriech N, Curran J, Ruigrok RW, Burmeister WP (2000) Tetrameric coiled
coil domain of Sendai virus phosphoprotein. Nat Struct Biol 7: 777–781.
doi:10.1038/79013.
26.
Gerard FCA, Ribeiro E de A, Albertini AAV, Gutsche I, Zaccai G, et al. (2007)
Unphosphorylated Rhabdoviridae Phosphoproteins Form Elongated Dimers in
Solution. Biochemistry 46: 10328–10338. doi:10.1021/bi7007799.
27.
Gerard FCA, Ribeiro E de A, Leyrat C, Ivanov I, Blondel D, et al. (2009) Modular
organization of rabies virus phosphoprotein. J Mol Biol 388: 978–996.
doi:10.1016/j.jmb.2009.03.061.
28.
Chenik M, Chebli K, Gaudin Y, Blondel D (1994) In vivo interaction of rabies
virus phosphoprotein (P) and nucleoprotein (N): existence of two N-binding sites
on P protein. J Gen Virol 75 ( Pt 11): 2889–2896.
29.
Mavrakis M, Méhouas S, Réal E, Iseni F, Blondel D, et al. (2006) Rabies virus
chaperone: identification of the phosphoprotein peptide that keeps nucleoprotein
soluble and free from non-specific RNA. Virology 349: 422–429.
doi:10.1016/j.virol.2006.01.030.
30.
Fu ZF, Zheng Y, Wunner WH, Koprowski H, Dietzschold B (1994) Both the N-
and the C-terminal domains of the nominal phosphoprotein of rabies virus are
involved in binding to the nucleoprotein. Virology 200: 590–597.
doi:10.1006/viro.1994.1222.
31.
Das SC, Pattnaik AK (2004) Phosphorylation of vesicular stomatitis virus
phosphoprotein P is indispensable for virus growth. J Virol 78: 6420–6430.
doi:10.1128/JVI.78.12.6420-6430.2004.
32.
Schoehn G, Iseni F, Mavrakis M, Blondel D, Ruigrok RW (2001) Structure of
recombinant rabies virus nucleoprotein-RNA complex and identification of the
phosphoprotein binding site. J Virol 75: 490–498. doi:10.1128/JVI.75.1.490-498.2001.
33.
Albertini A, Schoehn G, Weissenhorn W (2008) Structural aspects of rabies virus
replication. Cell Mol Life Sci 65: 282–294.
34.
Albertini AAV (2006) Crystal Structure of the Rabies Virus Nucleoprotein-RNA
Complex. Science 313: 360–363. doi:10.1126/science.1125280.
35.
Green TJ (2006) Structure of the Vesicular Stomatitis Virus Nucleoprotein-RNA
Complex. Science 313: 357–360. doi:10.1126/science.1126953.
36.
Ding H, Green T, Lu S (2006) Crystal structure of the oligomerization domain of
the phosphoprotein of vesicular stomatitis virus. J Virol. 80(6) 2808-2818
37.
Mavrakis M, McCarthy AA, Roche S, Blondel D, Ruigrok RWH (2004) Structure
and Function of the C-terminal Domain of the Polymerase Cofactor of Rabies
Virus. J Mol Biol 343: 819–831. doi:10.1016/j.jmb.2004.08.071.
38.
Ribeiro EA, Favier A, Gerard FCA, Leyrat C, Brutscher B, et al. (2008) Solution
structure of the C-terminal nucleoprotein-RNA binding domain of the vesicular
stomatitis virus phosphoprotein. J Mol Biol 382: 525–538.
doi:10.1016/j.jmb.2008.07.028.
39.
Green TJ, Luo M (2009) Structure of the vesicular stomatitis virus nucleocapsid in
complex with the nucleocapsid-binding domain of the small polymerase cofactor,
P. Proc Natl Acad Sci USA 106: 11713–11718. doi:10.1073/pnas.0903228106.
40.
Ribeiro E de A, Leyrat C, Gerard FCA, Albertini AAV, Falk C, et al. (2009)
Binding of rabies virus polymerase cofactor to recombinant circular nucleoprotein-RNA
complexes. J Mol Biol 394: 558–575. doi:10.1016/j.jmb.2009.09.042.
41.
Poenisch M, Wille S, Ackermann A, Staeheli P, Schneider U (2007) The X protein
of borna disease virus serves essential functions in the viral multiplication cycle. J
Virol 81: 7297–7299. doi:10.1128/JVI.02468-06.
42.
de la Torre JC (2002) Molecular biology of Borna disease virus and persistence.
Front Biosci 7: d569–d579.
43.
Schneider U (2005) Novel insights into the regulation of the viral polymerase
complex of neurotropic Borna disease virus. Virus Research 111: 148–160.
doi:10.1016/j.virusres.2005.04.006.
44.
Poenisch M, Wille S, Staeheli P, Schneider U (2008) Polymerase read-through at
the first transcription termination site contributes to regulation of borna disease
virus gene expression. J Virol 82: 9537–9545. doi:10.1128/JVI.00639-08.
45.
Poenisch M, Staeheli P, Schneider U (2008) Viral accessory protein X stimulates
the assembly of functional Borna disease virus polymerase complexes. J Gen Virol
89: 1442–1445. doi:10.1099/vir.0.2008/000638-0.
46.
Poenisch M, Unterstab G, Wolff T, Staeheli P, Schneider U (2004) The X protein
of Borna disease virus regulates viral polymerase activity through interaction with
the P protein. J Gen Virol 85: 1895–1898. doi:10.1099/vir.0.80002-0.
47.
Cleveland SB, Davies J, McClure MA (2011) A bioinformatics approach to the
structure, function, and evolution of the nucleoprotein of the order
mononegavirales. PLoS One 6: e19275. doi:10.1371/journal.pone.0019275.
48.
Yang ZR, Thomson R, McNeil P, Esnouf RM (2005) RONN: the bio-basis
function neural network technique applied to the detection of natively disordered
regions in proteins. Bioinformatics 21: 3369–3376.
doi:10.1093/bioinformatics/bti534.
49.
Li J, Rahmeh A, Morelli M, Whelan SPJ (2008) A Conserved Motif in Region V
of the Large Polymerase Proteins of Nonsegmented Negative-Sense RNA Viruses
That Is Essential for mRNA Capping. J Virol 80(2): 775-784
50.
Jordan IK, Sutter BA IV, McClure MA (2000) Molecular evolution of the
Paramyxoviridae and Rhabdoviridae multiple-protein-encoding P gene. Mol Biol
Evol 17(1): 75–86.
51.
Paul PR, Chattopadhyay D, Banerjee AK (1988) The functional domains of the
phosphoprotein (NS) of vesicular stomatitis virus (Indiana serotype). Virology
166: 350–357.
52.
Chen M, Ogino T, Banerjee AK (2007) Interaction of vesicular stomatitis virus P
and N proteins: identification of two overlapping domains at the N terminus of P
that are involved in N0-P complex formation and encapsidation of viral genome
RNA. J Virol 81: 13478–13485. doi:10.1128/JVI.01244-07.
53.
Emerson SU, Schubert M (1987) Location of the binding domains for the RNA
polymerase L and the ribonucleocapsid template within different halves of the NS
phosphoprotein of vesicular stomatitis virus. Proc Natl Acad Sci USA 84: 5655–
5659.
54.
Olmea O, Valencia A (1997) Improving contact predictions by the combination of
correlated mutations and other sources of sequence information. Folding and
Design 2: S25–S32. doi:10.1016/S1359-0278(97)00060-6.
55.
Fariselli P, Casadio R (1999) A neural network based predictor of residue contacts
in proteins. Protein Eng 12: 15–21.
56.
Tompa P (2002) Intrinsically unstructured proteins. Trends Biochem Sci 27: 527–
533. doi:10.1016/S0968-0004(02)02169-2.
57.
Tsai CJ, Ma B, Sham YY, Kumar S, Nussinov R (2001) Structured disorder and
conformational selection. Proteins 44: 418–427.
58.
Cevik B, Smallwood S, Moyer SA (2003) The L-L oligomerization domain resides
at the very N-terminus of the sendai virus L RNA polymerase protein. Virology
313: 525–536.
59.
Cevik B, Smallwood S, Moyer SA (2007) Two N-terminal regions of the Sendai
virus L RNA polymerase protein participate in oligomerization. Virology 363:
189–197. doi:10.1016/j.virol.2007.01.032.
60.
Ferron F, Longhi S, Henrissat B, Canard B (2002) Viral RNA-polymerases-a
predicted 2'-O-ribose methyltransferase domain shared by all Mononegavirales.
Trends Biochem Sci 27: 222–224.
61.
Li J, Fontaine-Rodriguez EC, Whelan SPJ (2005) Amino Acid Residues within
Conserved Domain VI of the Vesicular Stomatitis Virus Large Polymerase Protein
Essential for mRNA Cap Methyltransferase Activity. J Virol 79(21): 13373-13384
62.
Cheng X (1995) Structure and function of DNA methyltransferases. Annu Rev
Biophys Biomol Struct 24: 293–318. doi:10.1146/annurev.bb.24.060195.001453.
63.
Johansson K (2003) Crystal Structure of the Measles Virus Phosphoprotein
Domain Responsible for the Induced Folding of the C-terminal Domain of the
Nucleoprotein. J Biol Chem 278: 44567–44573. doi:10.1074/jbc.M308745200.
64.
Karlin D, Ferron F, Canard B, Longhi S (2003) Structural disorder and modular
organization in Paramyxovirinae N and P. J Gen Virol 84: 3239–3252.
65.
Karlin D, Belshaw R (2012) Detecting remote sequence homology in disordered
proteins: discovery of conserved motifs in the N-termini of Mononegavirales
phosphoproteins. PLoS One 7: e31719. doi:10.1371/journal.pone.0031719.
66.
Rahaman A, Srinivasan N, Shamala N, Shaila MS (2004) Phosphoprotein of the
rinderpest virus forms a tetramer through a coiled coil region important for
biological function. A structural insight. J Biol Chem 279: 23606–23614.
doi:10.1074/jbc.M400673200.
67.
Gerard F, Ribeiro E Jr, Leyrat C (2009) Modular organization of rabies virus
phosphoprotein. J Mol Biol 388: 978-996
68.
Curran J, Marq JB, Kolakofsky D (1995) An N-terminal domain of the Sendai
paramyxovirus P protein acts as a chaperone for the NP protein during the nascent
chain assembly step of genome replication. J Virol 69: 849–855.
69.
Schwemmle M, Salvatore M, Shi L, Richt J, Lee CH, et al. (1998) Interactions of
the borna disease virus P, N, and X proteins and their functional implications. J
Biol Chem 273: 9007–9012.
70.
Davis IW, Leaver-Fay A, Chen VB, Block JN, Kapral GJ, et al. (2007)
MolProbity: all-atom contacts and structure validation for proteins and nucleic
acids. Nucleic Acids Research 35: W375–W383. doi:10.1093/nar/gkm216.
71.
Pollock DD, Taylor WR, Goldman N (1999) Coevolving protein residues:
maximum likelihood identification and relationship to structure. J Mol Biol 287:
187–198. doi:10.1006/jmbi.1998.2601.
72.
Xue B, Dunbrack RL, Williams RW, Dunker AK, Uversky VN (2010) PONDR-FIT:
A meta-predictor of intrinsically disordered amino acids. BBA - Proteins and
Proteomics 1804: 996–1010. doi:10.1016/j.bbapap.2010.01.011.
73.
Dosztányi Z, Csizmok V, Tompa P, Simon I (2005) IUPred: web server for the
prediction of intrinsically unstructured regions of proteins based on estimated
energy content. Bioinformatics 21: 3433–3434. doi:10.1093/bioinformatics/bti541.
74.
Dosztányi Z, Csizmók V, Tompa P, Simon I (2005) The pairwise energy content
estimated from amino acid composition discriminates between folded and
intrinsically unstructured proteins. J Mol Biol 347: 827–839.
doi:10.1016/j.jmb.2005.01.071.
75.
Linding R, Jensen LJ, Diella F, Bork P, Gibson TJ, et al. (2003) Protein disorder
prediction: implications for structural proteomics. Structure 11: 1453–1459.
76.
del Sol Mesa A, Pazos F, Valencia A (2003) Automatic Methods for Predicting
Functionally Important Residues. J Mol Biol 326: 1289–1302.
doi:10.1016/S0022-2836(02)01451-1.
77.
Pazos F, Rausell A, Valencia A (2006) Phylogeny-independent detection of
functional residues. Bioinformatics 22: 1440–1448.
doi:10.1093/bioinformatics/btl104.
78.
Fares MA, McNally D (2006) CAPS: coevolution analysis using protein
sequences. Bioinformatics 22: 2821–2822. doi:10.1093/bioinformatics/btl493.
79.
Berezin C, Glaser F, Rosenberg J, Paz I, Pupko T, et al. (2004) ConSeq: the
identification of functionally and structurally important residues in protein
sequences. Bioinformatics 20: 1322–1324. doi:10.1093/bioinformatics/bth070.
80.
Waterhouse AM, Procter JB, Martin DMA, Clamp M, Barton GJ (2009) Jalview
Version 2--a multiple sequence alignment editor and analysis workbench.
Bioinformatics 25: 1189–1191. doi:10.1093/bioinformatics/btp033.
THE DISICC PIPELINE AND DISICC DATABASE
To facilitate this research I designed a pipeline to streamline the data acquisition,
storage and evaluation of the information. The goal of the pipeline is to accept protein
MSAs and run the sequences through all the disorder methods, intra-residue contact prediction,
functional and structural residue prediction and intra-protein compensatory mutation
predictions. Although workflow applications such as Taverna existed when this study
began, their maturity and overall utility were lacking and not in line with the software
model I was interested in pursuing. Thus, I chose to create my own system and to
leverage the work in data management that the Montana State University Research
Computing Group was also pursuing.
Software Stack and Application
The initial proof-of-concept pipeline and database were implemented in PHP and
MySQL. However, it quickly became obvious that PHP lacked many informatics
libraries and robust frameworks. To implement the pipeline properly and in an easily
maintainable way, I chose the Ruby language for its community support, the number
of existing packages such as BioRuby, the maturity of its web frameworks, and its
support for open source. In addition to making the pipeline useful for my research, I
decided to make the pipeline a web application to give other investigators an easy and
simple user interface. Since many scientific software applications fail to achieve a wide
user base and are eventually abandoned, providing an accessible user interface
encourages adoption and promotes re-use of DisICC. Ruby on Rails was chosen as the
web framework for the application based on its maturity, open source community, and
rapid development support. For data model abstraction, the DataMapper Object
Relational Mapper (ORM) was used for flexibility and ease of use. The use of an ORM
also provides the ability to change the database backend technology at some point:
DataMapper provides abstraction for database storage without having to modify the core
software logic.
The current application server is an Ubuntu-based Linux server using Apache as
the web server. The application is not locked to Ubuntu or Apache and can be hosted on
any platform that supports Ruby and any web server such as Nginx or Unicorn. In
addition to hosting the DisICC software, the application server hosts and uses three of the
prediction applications: ConSEQ (as Rate4Site), CAPS and XDET.
The PostgreSQL database that is used as the backend storage for DisICC is hosted
on a separate server. Rather than hosting the database on the same server as the
application (where both systems compete for resources), I have separated them for
performance and scalability. This separation allows multiple instances of the DisICC
application, either on one server or multiple servers, to connect to the same data-source.
This makes scaling as simple as bringing up another instance of DisICC. This design
choice was made to create an application that could scale for multiple users. Further, as
more and more research moves to the cloud, DisICC is in line to take advantage of the
developing resources such as Amazon EC2 and Heroku.
[Figure 4.1 diagram. DisICC Server: Webserver, Rack, Rails, DisICC Models, Devise Auth, DataMapper, File System, ConSEQ, CAPS, XDET. Database Server: PostgreSQL Database. Services reached over HTTP: DisEMBL, PONDR Fit, CORNET, RONN.]
Figure 4.1: DisICC Application Organization. The DisICC application is organized onto
two separate machines: the webserver (DisICC Server) and the database server. Within
the webserver are the different components that allow DisICC to run as a web application
such as Rack middleware (orange) and Ruby on Rails (red). Within Rails are the
DisICC models (yellow) that correspond to the DisICC data objects and work with the
authorization layer (green) and the DataMapper ORM layer (pink) to authenticate and talk to
the PostgreSQL database (blue cylinder), respectively. Additionally, Rails provides
access to the file system (grey cylinder). The different prediction methods are shown in
white boxes. DisEMBL, PONDR Fit, CORNET and RONN are web services that DisICC
accesses via Ruby HTTP requests, while ConSEQ, CAPS, and XDET are accessed using
Ruby system calls.
Database Schema
A relational database was chosen to meet the following requirements: storage of
data with efficient and normalized organization, granular access to the amino acid level,
parallel access, and performance. This storage method was necessary as there are
1,370,416 data points derived from 1,512 prediction result files. These data points were
generated from the 171,302 individual amino acids of the 189 sequences, that is, eight
result files per sequence and eight data points per residue. DisICC uses
PostgreSQL as the relational database backend not only for the method results, but also
for the protein sequences, alignments, and meta-data: this permits advanced queries, data
associations, and data provenance. Although database schemas currently exist for storing
sequence information, no public schema capable of supporting my requirements existed
during the early development of DisICC. Hence, a DisICC-specific schema was
constructed (Fig 6.2). This does not prevent DisICC information from being serialized to
a different schema, as BioRuby supports BioSQL and many alignment formats.
Additionally, the flexibility and utility of the DataMapper objects allow them to easily
produce XML, JSON and CSV versions.
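The kind of residue-level query that a normalized relational layout permits can be sketched as follows. This is only an illustration: it uses Python with an in-memory SQLite database as a stand-in for PostgreSQL, the table names sequences and aa_sequences are hypothetical, the column names are borrowed from the Sequence and AASequence objects of the schema, and the accession and values are placeholders.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for the PostgreSQL backend.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sequences (
    id INTEGER PRIMARY KEY,
    name TEXT,
    abrev TEXT,
    accession TEXT
);
CREATE TABLE aa_sequences (
    id INTEGER PRIMARY KEY,
    seq_id INTEGER REFERENCES sequences(id),
    amino_acid TEXT,
    original_position INTEGER,
    disorder_consensus REAL
);
""")

# Placeholder sequence record and three residues with toy consensus values.
conn.execute("INSERT INTO sequences VALUES (1, 'Toy protein', 'TOY', 'XX_000000')")
residues = [(1, "M", 1, 0.8), (1, "S", 2, 0.6), (1, "K", 3, 0.2)]
conn.executemany(
    "INSERT INTO aa_sequences (seq_id, amino_acid, original_position, disorder_consensus) "
    "VALUES (?, ?, ?, ?)",
    residues,
)

# Granular access: every residue with a consensus disorder of at least 0.5.
rows = conn.execute(
    "SELECT s.abrev, a.original_position, a.amino_acid "
    "FROM aa_sequences a JOIN sequences s ON s.id = a.seq_id "
    "WHERE a.disorder_consensus >= 0.5 ORDER BY a.original_position"
).fetchall()
```

Storing one row per amino acid is what makes such a query a single join rather than a pass over result files.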
Data Objects
DisICC was implemented in a way that supports the storage of the various
predictions, alignments and sequences in a database and also provides objects via the
DataMapper ORM. The concept of object-oriented programming (OOP) has been around
since the 1960s and gained traction in the programming community in the 1990s. The
Ruby language was constructed around OOP, which was one of the reasons it was chosen
for the described study.
The objects in DisICC map to database tables and contain attributes that map
directly to database fields (Fig 6.2). In addition to storing data, these objects also possess
methods.
Disorder Type
  Attributes: id :Serial, type :String, deleted_at :DateTime
  Associations: has n :disorders
Disorder
  Attributes: id :Serial, disorder_type :String, version :Integer, seq_id :Integer, threshold :Float, deleted_at :DateTime
  Associations: has 1 :disorder_id, belongs_to :sequence, has n :disorder_values
Alignment
  Attributes: id :Serial, seq_id :Integer, name :String, align_order :Integer, alignment_sequence :Text, fasta_title :Text, deleted_at :DateTime
  Associations: has 1 :sequence, has n :alignment_positions, has 1 :percent_id
Sequence
  Attributes: id :Serial, name :String, sequence :Text, type :String, accession :String, abrev :String, disorder_percent :Float, alternate_name :String, owner :Integer, deleted_at :DateTime
  Associations: has n :a_asequences, has n :users, has n :disorders, has n :intra_residue_contacts, has n :caps, has n :xdets, has n :conseqs, has n :alignments, has n :percent_ids
User
  Attributes: id :Serial, first_name :String, last_name :String, login :String
  Associations: has n :sequences
AASequence
  Attributes: id :Serial, seq_id :Integer, amino_acid :String, original_position :Integer, disorder_consensus :Float, contact_consensus :Float, contact_positive_consensus :Float, deleted_at :DateTime
  Associations: belongs_to :sequence, has n :disorder_values, has 1 :xdet, has 1 :conseq, has n :caps, has n :intercaps, has n :intraresidue_contacts, has n :alignment_positions
Percent Identity
  Attributes: seq1_id :Integer, seq2_id :Integer, alignmnet_name :String, percent_id :Float, deleted_at :DateTime
  Associations: has 1 :sequence, has 1 :alignment
Alignment Position
  Attributes: id :Serial, alignment_id :Integer, position :Integer, aasequence_id :Integer, deleted_at :DateTime
  Associations: has 1 :alignment, has 1 :sequence, has 1 :aasequence
Inter Caps
  Attributes: id :Serial, seq1_id :Integer, seq2_id :Integer, aasequence1_id :Integer, aasequece2_id :Integer, position_one :Integer, position_two :Integer, mean_one :Float, mean_two :Float, correlation :Float, deleted_at :DateTime
  Associations: belongs_to :sequence
Disorder Value
  Attributes: id :Serial, disorder_id :Integer, aasequence_id :Integer, dvalue :Float, deleted_at :DateTime
  Associations: belongs_to :disorder, belongs_to :aasequence
Xdet
  Attributes: id :Serial, aasequence_id :Integer, conservation :Float, correlation :Float, seq_id :Integer, login :String
  Associations: belongs_to :sequence, belongs_to :aasequence
Conseq
  Attributes: id :Serial, seq_id :Integer, asequence_id :Integer, score :Float, color :Integer, state :String, function :String, msa_data :String, residue_variety :String, deleted_at :DateTime
  Associations: has 1 :sequence, belongs_to :aasequence
Caps
  Attributes: id :Serial, seq_id :Integer, aasequence_id :Integer, position_one :Integer, position_two :Integer, mean_one :Float, mean_two :Float, correlation :Float, deleted_at :DateTime
  Associations: belongs_to :sequence
Intra Residue Contact
  Attributes: id :Serial, seq_id :Integer, first_residue :Integer, second_residue :Integer, confidence :Float, type :String, d1 :Integer, d2 :Integer, deleted_at :DateTime
  Associations: has 1 :alignment, has 1 :sequence, has 1 :aasequence
Figure 4.2: DisICC Database and Object Schema. This UML diagram outlines the
organization for the database tables and Ruby objects that are used to store information
for the DisICC Pipeline. The lines represent associations between data objects. The
numbers correspond to the number of objects per instance, where * stands for many.
These methods provide a variety of functions from data conversion and import to
reporting and statistical calculations. Although these methods are invisible to a typical
DisICC user, they are available to investigators as part of the DisICC library.
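As an illustration of objects that carry both data and methods, the following sketch pairs a sequence object with its amino acid objects and adds one reporting method and two conversion methods. It is written in Python rather than Ruby and is not the DataMapper code. The class and field names loosely follow Figure 4.2, while the `residues` attribute, the method names and the 0.5 cutoff are assumptions made for the example.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AASequence:
    amino_acid: str
    original_position: int
    disorder_consensus: float = 0.0


@dataclass
class Sequence:
    name: str
    residues: list = field(default_factory=list)

    def disorder_percent(self, threshold=0.5):
        """Percentage of residues whose disorder consensus is >= threshold.

        The cutoff is an illustrative assumption, not a value from DisICC.
        """
        if not self.residues:
            return 0.0
        hits = sum(r.disorder_consensus >= threshold for r in self.residues)
        return 100.0 * hits / len(self.residues)

    def to_json(self):
        """Serialize the sequence and its residues as a JSON string."""
        return json.dumps(asdict(self), sort_keys=True)

    def to_csv(self):
        """Serialize the residues as CSV text, one row per amino acid."""
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        writer.writerow(["original_position", "amino_acid", "disorder_consensus"])
        for r in self.residues:
            writer.writerow([r.original_position, r.amino_acid, r.disorder_consensus])
        return out.getvalue()
```

Keeping conversion and reporting on the object means that a caller never has to know how the underlying rows are laid out.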
Data Visualization
Running methods and gathering data are only the beginning of research and
investigation. Data analysis is the next step, and DisICC provides tools that make
trends easier to spot. DisICC makes use of a popular JavaScript
framework and visualization libraries.
JavaScript was chosen for
visualization in DisICC because the browser is the primary mode of interaction with the
software, and all browsers support JavaScript; this removes the need to install
additional third-party plugins.
Figure 4.3 Parallel Coordinates sample graph of the P order results from DisICC.
The four axes correspond to the amino acid position in the MSA (Position), the position
consensus of disorder from 0-1 (Disorder), the consensus of CICP prediction 0-1 (CICP)
and the conservation score from the FABAT method. A lower score indicates better
conservation.
One of the basic visualizations DisICC provides is line graphs through jQuery
graph. These graphs are implemented for each disorder method, disorder consensus,
CICP consensus, inter-consensus predictions and conservation. The line graphs are
interactive: each data point displays its sequence position and result. Another powerful
tool for visually discovering trends across multiple variables is a parallel coordinates
graph. An implementation of the D3.js parallel coordinates graph (Fig 4.3) is provided
for comparing disorder consensus, CICP consensus, and conservation against each other.
This graph is fully interactive and allows subset selection of combined variables, column
rearrangement and data-grid integration.
Running the Pipeline
From the user interface, starting the DisICC pipeline to evaluate an alignment of
sequences is simple. A user can choose to upload the alignment and have the system
automatically run all the methods, or the user can upload the alignment and choose to run
only disorder or only CICP methods. Once a FASTA alignment file has been
uploaded, the system parses that alignment into sequence objects and the corresponding amino acid
objects. These are then associated with alignment objects that represent each sequence's
state in the alignment. Alignment position objects are then created and associated with
each amino acid sequence object (Fig 4.2). Once everything is normalized in the database,
DisICC is primed to run the pipeline methods.
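The parsing step described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the DisICC Ruby code: the class names mirror Figure 4.2 only loosely, positions are 0-based, gaps are written as `-` or `.`, and `parse_fasta_alignment` is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class AminoAcid:
    amino_acid: str
    original_position: int  # 0-based index in the ungapped sequence (assumed)


@dataclass
class AlignmentPosition:
    position: int           # 0-based column in the alignment (assumed)
    amino_acid: AminoAcid


@dataclass
class Sequence:
    name: str
    sequence: str = ""             # ungapped residues
    alignment_sequence: str = ""   # gapped row exactly as uploaded
    amino_acids: list = field(default_factory=list)
    alignment_positions: list = field(default_factory=list)


def parse_fasta_alignment(text, gap_chars="-."):
    """Split a FASTA alignment into sequence, amino acid and position objects."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))

    sequences = []
    for name, row in records:
        seq = Sequence(name=name, alignment_sequence=row)
        for column, symbol in enumerate(row):
            if symbol in gap_chars:
                continue  # gaps get no amino acid object
            aa = AminoAcid(symbol, len(seq.amino_acids))
            seq.amino_acids.append(aa)
            seq.alignment_positions.append(AlignmentPosition(column, aa))
        seq.sequence = "".join(a.amino_acid for a in seq.amino_acids)
        sequences.append(seq)
    return sequences
```

Each residue therefore keeps two coordinates, its index in the unaligned sequence and its column in the alignment, which is what lets per-residue results be compared across sequences later.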
The disorder part of the pipeline requires only the sequence information stored in
the DisICC database. Each sequence is passed to all the disorder prediction methods
(IUPred, PONDR Fit, DisEMBL, and RONN) at once through process threads. The
threads allow the calls to the web services to occur simultaneously, and the results are
parsed in arbitrary order into the appropriate disorder and disorder value objects that
become associated with the sequence object and amino acid objects.
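The threaded fan-out can be sketched as below. Python's `concurrent.futures` stands in for the Ruby threads, and the four predictors are local stubs that return made-up scores. No real web service is contacted, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def make_stub_predictor(offset):
    """Stand-in for a remote predictor: one made-up score per residue."""
    def predict(sequence):
        return [round(((i + offset) % 10) / 10, 1) for i in range(len(sequence))]
    return predict


# Method names come from the text; the callables are placeholders only.
PREDICTORS = {
    "IUPred": make_stub_predictor(0),
    "PONDR Fit": make_stub_predictor(1),
    "DisEMBL": make_stub_predictor(2),
    "RONN": make_stub_predictor(3),
}


def run_disorder_methods(sequence, predictors=PREDICTORS):
    """Call every predictor concurrently and collect results by method name."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(predictors)) as pool:
        futures = {pool.submit(fn, sequence): name for name, fn in predictors.items()}
        for future in as_completed(futures):  # completion order is arbitrary
            results[futures[future]] = future.result()
    return results
```

Keying the results by method name makes the arbitrary completion order harmless, since each result is stored under the method that produced it.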
All the CICP methods except CORNET require an MSA input. These inputs are
restricted to sequences with percent identities less than 90% and greater than 19%.
DisICC provides a method to calculate the percent identities between all the sequences
and then generate a sub-alignment of sequences meeting the criteria. Each sequence in
the original input alignment has a sub-set alignment generated for it, and these alignments
are passed to ConSEQ, CAPS and XDet. The results of each of these methods are parsed
into different objects, CORNET (intra_residue_contact), ConSEQ (conseq), CAPS (Caps)
and XDet (xdet), that are associated with the corresponding sequence and amino acid; this
is due to the different result formats and thresholds indicating positive or negative
results. After all method results are stored, DisICC can generate the CICP for each amino acid
by evaluating each method and assigning a 1 for a positive or 0 for a negative result. These
normalized values are then averaged and stored in the amino acid as a consensus result.
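The two calculations in this paragraph, the percent-identity filter and the binary consensus, can be sketched as follows. The text does not say how DisICC computes percent identity, so the definition used here (identical residues over columns where both sequences have a residue) is an assumption, and all function names are hypothetical.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Percent of identical residues over columns where both rows have a residue.

    This is one common definition; which one DisICC uses is not stated.
    """
    paired = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    if not paired:
        return 0.0
    return 100.0 * sum(a == b for a, b in paired) / len(paired)


def sub_alignment(alignment, query, low=19.0, high=90.0):
    """Names kept for `query`: the query plus every sequence whose percent
    identity to it is greater than `low` and less than `high`."""
    kept = [query]
    for name, row in alignment.items():
        if name != query and low < percent_identity(alignment[query], row) < high:
            kept.append(name)
    return kept


def cicp_consensus(calls):
    """Average of per-method binary calls (1 = positive, 0 = negative)."""
    return sum(calls) / len(calls) if calls else 0.0
```

For example, a residue called positive by three of four methods receives a consensus of 0.75, and a sequence that is 100% identical to the query is left out of the query's sub-alignment.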
Disorder and CICP alignment conservation scores are calculated upon display by
summing the consensus values for each amino acid at each alignment position. For
disorder, the sum is averaged over the number of sequences in the alignment; for the CICP
conservation score, it is averaged over the number of sequences that actually met the CICP
criteria. These results are then passed to the browser for display as annotated
MSAs, graphs and data grids.
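The difference between the two denominators can be made concrete with a short sketch. The data layout is an assumption: each row holds one consensus value per alignment column, `None` marks a gap, and a CICP row that is `None` as a whole marks a sequence that did not meet the CICP criteria. The name `column_scores` is hypothetical.

```python
def column_scores(disorder_rows, cicp_rows):
    """Return per-column (disorder, CICP) averages for an alignment.

    Disorder sums are divided by the number of sequences in the alignment;
    CICP sums are divided by the number of sequences that have CICP results.
    """
    n_seqs = len(disorder_rows)
    eligible = [row for row in cicp_rows if row is not None]
    n_cols = len(disorder_rows[0])
    disorder, cicp = [], []
    for col in range(n_cols):
        # `or 0.0` makes a gap (None) contribute nothing to the sum.
        d_sum = sum(row[col] or 0.0 for row in disorder_rows)
        disorder.append(d_sum / n_seqs)
        c_sum = sum(row[col] or 0.0 for row in eligible)
        cicp.append(c_sum / len(eligible) if eligible else 0.0)
    return disorder, cicp
```

With three sequences of which only two were eligible for the CICP methods, the disorder sums are divided by three and the CICP sums by two.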
Quality Control
No researcher likes a black box where input goes in and magical results come out.
To ensure that I and other investigators could be confident in the results the pipeline
produced, browsing of individual method results is supported. In addition to method
results, users can also browse the inputs that go into these methods,
including the amino acid sequences, uploaded alignments and alignments generated from
percent identities.
Future Work
Looking ahead, I would like to add support for additional predictive methods and
likely replace some of the current methods with newer and more accurate ones. For
instance, CORNET could be replaced with SVMSEQ, although SVMSEQ does have a
limitation of 1500 amino acids for sequence evaluation. The addition of inter-protein
predictive methods would also strengthen the predictions produced by DisICC and add
another dimension of data that would be useful to me and other investigators.
Additionally, data visualization and analysis tools to support this research need to be
enhanced. In the same way that I developed a visualization to make disorder
and CICP results easier to evaluate, I would like to continue to expand these features and
add visualizations to the DisICC UI to aid investigators in more rapid
analysis.
Availability
The source code is available on GitHub at https://github.com/scleveland/DisICC.
A demo of the software can be found at http://bioline.rcg.montana.edu/.
GENERAL CONCLUSION
Summary of the Study
This dissertation presents the research results obtained from combining multiple
methods of protein sequence analysis, including disorder, intra-residue contacts,
conservation, evolutionary dynamics, and co-evolution predictions, into a pipeline. This
pipeline allows the rapid correlation of results for identifying protein interaction regions
in a subset of the sequence alignment space. These subset predictions can be used to infer
conserved features for the larger alignment sequence space. This approach has been
condensed into the Disorder, Intra-residue contact and Compensatory mutation Correlator
(DisICC) pipeline for general use. The concept of using these metrics for binding region
identification is not new, but combining them is a novel and robust approach to
inferring important amino acid residues. The sequence space in this study covers 63
representative viruses and the three transcription/replication complex proteins N, L and P,
totaling 189 sequences. The lack of structural information for many of the 63 viruses
allows validation of a new structure-independent approach, as well as providing valuable
information for this important viral Order. In addition to developing a new analytical
pipeline, a method for storing, retrieving, and querying the information was constructed
along with supporting object libraries that allow better access and analysis of these data
(Fig 4.1, 4.2). These libraries enable better reuse of the software and rapid addition of
further in silico methods.
Nucleoprotein Conclusions
The N protein analysis in Chapter 2 shows that the carboxyl-terminal region for the
Order is predicted to be a disordered binding region by DisICC.
This prediction
is corroborated by experimental results from SENV [9], NCDV [53] and MeV [54,55] that
show this region to be involved in binding the phosphoprotein. Additional validation of
this region was achieved through experimental data showing that the RABV N-RNA rings
had bound phosphoproteins on the tips of the rings [59]. I further used crystal structure
data of the VSV N:RNA & P complex [22] to examine the mapping of the predictions to
the identified binding regions in the nucleoprotein (Fig 2.7). The results further validate
that the predicted disordered region in the carboxyl-terminus is bound to the
phosphoprotein.
With the strong experimental validation of the DisICC results from
Paramyxoviridae and Rhabdoviridae, I can infer that these predicted phosphoprotein
binding regions are conserved for all members of the Order.
L Polymerase Conclusions
The L protein analysis in Chapter 3 shows matching binding region criteria that
span the Order in both the amino-terminal region and a portion of Domain V (Fig 3.1,
3.2). The amino-terminal region binding prediction is experimentally validated as the L-L
complex region from studies of SENV [84,85]. Two factors suggest that the DisICC
results can be applied to all members of the Order: the conservation of all predictions
between the families, and the homology the L polymerase has across the entire Order.
The binding region predicted in Domain V is validated by the experimental evidence of
the presence of the conserved capping motifs in this region. These motifs have been shown to
be conserved across the Order [86]. Within the Paramyxoviridae and Rhabdoviridae a
binding region is predicted in Domain VI (Fig 3.1A, Fig 3.1B, Fig 3.2). This predicted
binding region is experimentally validated by the presence of the conserved motifs II and
III of the methyltransferase, which have been previously shown in VSV, BEFV, REBOV,
RABV, HRSV, MeV and SENV [76,87]. This validation allows the inference of these
binding regions to all the other members of Paramyxoviridae and Rhabdoviridae from this
study.
These additional validations of the binding prediction results from DisICC
provide further evidence for the pipeline method in identifying binding regions.
Phosphoprotein Conclusions
The L polymerase and nucleoprotein results allow a high degree of inference. In
contrast, the phosphoprotein analysis in Chapter 3 reveals binding region criteria that can
only be applied within each viral family. Due to the significant divergence between
families, high rates of evolution, and differences in domain organization, comparisons of
phosphoproteins across families could not be justified.
Also, due to the level of
divergence within families, only the Paramyxovirinae sample sequences were eligible for
submission to the intra-residue and co-evolution prediction methods. DisICC predicted
Paramyxoviridae to have binding regions in the amino-terminal region; this predicted region is
validated by experimental results from SENV [88] and Rinderpest virus [89] as the
location for the oligomerization domain, allowing the inference of this binding region for
all the Paramyxoviridae viruses.
Conclusion
This dissertation, through the use of the DisICC pipeline, expanded the body of
knowledge about the transcription/replication complex and the role disorder, intra-residue
contact and evolution play within the three proteins (N, L and P). These results
provide additional insight and possible anti-viral targets for important human pathogens.
The successful prediction of binding regions for the three proteins (N, L and P) and the
validation by both experimental and structural studies show the utility of the DisICC
pipeline for future work. Future plans for DisICC include adding methods to improve
utility and integrating with other community resources to increase adoption.
In summary, the DisICC pipeline and complementary software tools provide a
number of useful methods for investigators in a user-friendly and powerful package for
rapid sequence analysis. The resulting analyses can provide insight into binding regions,
evolutionarily conserved structural/functional features and flexible regions, even in
proteins with little to no direct structural information or indirect (threaded) models.
REFERENCES CITED
1.
Horie M, Honda T, Suzuki Y, Kobayashi Y, Daito T, et al. (2010) Endogenous
non-retroviral RNA virus elements in mammalian genomes. Nature 463: 84–87.
doi:10.1038/nature08695.
2.
Becker S, Huppertz S, Klenk HD, Feldmann H (1994) The nucleoprotein of
Marburg virus is phosphorylated. J Gen Virol 75 ( Pt 4): 809–818.
3.
Watanabe S, Noda T, Kawaoka Y (2006) Functional mapping of the nucleoprotein
of Ebola virus. J Virol 80: 3743.
4.
Chuang JL, Perrault J (1997) Initiation of vesicular stomatitis virus mutant polR1
transcription internally at the N gene in vitro. J Virol 71: 1466–1475.
5.
Lichty BD, Power AT, Stojdl DF, Bell JC (2004) Vesicular stomatitis virus: reinventing the bullet. Trends Mol Med 10: 210–216.
doi:10.1016/j.molmed.2004.03.003.
6.
Johnson JE, Coleman JW, Kalyan NK, Calderon P, Wright KJ, et al. (2009) In
vivo biodistribution of a highly attenuated recombinant vesicular stomatitis virus
expressing HIV-1 Gag following intramuscular, intranasal, or intravenous
inoculation. Vaccine 27: 2930–2939. doi:10.1016/j.vaccine.2009.03.006.
7.
Koser ML, McGettigan JP, Tan GS, Smith ME, Koprowski H, et al. (2004) Rabies
virus nucleoprotein as a carrier for foreign antigens. Proc Natl Acad Sci USA 101:
9405–9410. doi:10.1073/pnas.0403060101.
8.
Whelan SPJ, Barr JN, Wertz GW (2004) Transcription and replication of
nonsegmented negative-strand RNA viruses. Curr Top Microbiol Immunol 283:
61–119.
9.
Cevik B, Kaesberg J, Smallwood S, Feller JA, Moyer SA (2004) Mapping the
phosphoprotein binding site on Sendai virus NP protein assembled into
nucleocapsids. Virology 325: 216–224. doi:10.1016/j.virol.2004.05.012.
10.
Chuang JL, Jackson RL, Perrault J (1997) Isolation and Characterization of
Vesicular Stomatitis Virus PolR Revertants: Polymerase Readthrough of the
Leader–N Gene Junction Is Linked to an ATP-Dependent Function. Virology 229:
57–67. doi:10.1006/viro.1996.8418.
11.
Murphy LB, Loney C, Murray J, Bhella D, Ashton P, et al. (2003) Investigations
into the amino-terminal domain of the respiratory syncytial virus nucleocapsid
protein reveal elements important for nucleocapsid formation and interaction with
the phosphoprotein. Virology 307: 143–153.
12.
Bode L, Dürrwald R, Ludwig H (1994) Borna virus infections in cattle associated
with fatal neurological disease. Vet Rec 135: 283–284.
13.
Briese T, Schneemann A, Lewis AJ, et al. (1994)
Genomic organization of Borna disease virus. Proc Natl Acad Sci USA 91: 4362–
4366.
14.
Cubitt B, Oldstone C, de la Torre JC, et al. (1994) Sequence
and genome organization of Borna disease virus. J Virol 68: 1382–1396.
15.
Lundgren AL, Zimmermann W, Bode L, et al.
(1995) Staggering disease in cats: isolation and characterization of the feline Borna
disease virus. J Gen Virol 76 ( Pt 9): 2215–2222.
16.
de la Torre JC (1994) Molecular biology of borna disease virus: prototype of a new
group of animal viruses. J Virol 68: 7669–7675.
17.
Schneemann A, Schneider PA, Lamb RA, et al.
(1995) The remarkable coding strategy of borna disease virus: a new member of
the nonsegmented negative strand RNA viruses. Virology 210: 1–8.
doi:10.1006/viro.1995.1311.
18.
Kistler AL, Gancz A, Clubb S, et al. (2008) Recovery of
divergent avian bornaviruses from cases of proventricular dilatation disease:
identification of a candidate etiologic agent. Virol J 5: 88. doi:10.1186/1743-422X-5-88.
19.
Honkavuori KS, Shivaprasad HL, Williams BL,
et al. (2008) Novel borna virus in psittacine birds with proventricular dilatation
disease. Emerging Infect Dis 14: 1883–1886. doi:10.3201/eid1412.080984.
20.
Staeheli P, Rinder M, Kaspers B, et al. (2010) Avian
bornavirus associated with fatal disease in psittacine birds. J Virol 84: 6269–6275.
doi:10.1128/JVI.02567-09.
21.
Nakaya T, Takahashi H, Nakamura Y, Asahi S, Tobiume M, et al. (1996)
Demonstration of Borna disease virus RNA in peripheral blood mononuclear cells
derived from Japanese patients with chronic fatigue syndrome. FEBS Lett 378:
145–149.
22.
Kobayashi T, Zhang G, Lee B-J, et al. (2003) Modulation
of Borna disease virus phosphoprotein nuclear localization by the viral protein X
encoded in the overlapping open reading frame. J Virol 77: 8099–8107.
23.
de la Torre JC (2002) Molecular biology of Borna disease virus and persistence.
Front Biosci 7: d569–d579.
24.
Schneider U (2005) Novel insights into the regulation of the viral polymerase
complex of neurotropic Borna disease virus. Virus Research 111: 148–160.
doi:10.1016/j.virusres.2005.04.006.
25.
Poenisch M, Wille S, Staeheli P, Schneider U (2008) Polymerase read-through at
the first transcription termination site contributes to regulation of borna disease
virus gene expression. J Virol 82: 9537–9545. doi:10.1128/JVI.00639-08.
26.
Poenisch M, Staeheli P, Schneider U (2008) Viral accessory protein X stimulates
the assembly of functional Borna disease virus polymerase complexes. J Gen Virol
89: 1442–1445. doi:10.1099/vir.0.2008/000638-0.
27.
Poenisch M, Unterstab G, Wolff T, Staeheli P, Schneider U (2004) The X protein
of Borna disease virus regulates viral polymerase activity through interaction with
the P protein. J Gen Virol 85: 1895–1898. doi:10.1099/vir.0.80002-0.
28.
Poenisch M, Wille S, Ackermann A, Staeheli P, Schneider U (2007) The X protein
of borna disease virus serves essential functions in the viral multiplication cycle. J
Virol 81: 7297–7299. doi:10.1128/JVI.02468-06.
29.
Hosaka Y, Kitano H, Ikeguchi S (1966) Studies on the pleomorphism of HVJ
virions. Virology 29: 205–221.
30.
Klenk HD, Choppin PW (1969) Chemical composition of the parainfluenza virus
SV5. Virology 37: 155–157.
31.
Caliguiri LA, Klenk HD, Choppin PW (1969) The proteins of the parainfluenza
virus SV5. 1. Separation of virion polypeptides by polyacrylamide gel
electrophoresis. Virology 39: 460–466.
32.
Compans RW, Klenk HD, Caliguiri LA, Choppin PW (1970) Influenza virus
proteins. I. Analysis of polypeptides of the virion and identification of spike
glycoproteins. Virology 42: 880–889.
33.
Takada A, Robison C, Goto H, Sanchez A, Murti KG, et al. (1997) A system for
functional analysis of Ebola virus glycoprotein. Proc Natl Acad Sci USA 94:
14764–14769.
34.
Mahy BWJ (2010) The Evolution and Emergence of RNA Viruses. Emerging
Infect Dis 16: 899–899. doi:10.3201/eid1605.100164.
35.
Suzuki Y, Gojobori T (1997) The origin and evolution of Ebola and Marburg
viruses. Mol Biol Evol 14: 800–806.
36.
Jahrling PB, Geisbert TW, Geisbert JB, Swearengen JR, Bray M, et al. (1999)
Evaluation of Immune Globulin and Recombinant Interferon‐α2b for Treatment of
Experimental Ebola Virus Infections. J Infect Dis 179: S224–S234.
doi:10.1086/514310.
37.
Perrault J, McLear PW (1984) ATP dependence of vesicular stomatitis virus
transcription initiation and modulation by mutation in the nucleocapsid protein. J
Virol 51: 635–642.
38.
Irie T, Licata J, Harty R (2005) Functional characterization of Ebola virus
L-domains using VSV recombinants. Virology 336: 291–298.
doi:10.1016/j.virol.2005.03.027.
39.
Scherer CFC, O'Donnell V, Golde WT, Gregg D, Mark Estes D, et al. (2007)
Vesicular stomatitis New Jersey virus (VSNJV) infects keratinocytes and is
restricted to lesion sites and local lymph nodes in the bovine, a natural host. Vet
Res 38: 375–390. doi:10.1051/vetres:2007001.
40.
Rainwater-Lovett K, Pauszek SJ, Kelley WN, Rodriguez LL (2007) Molecular
epidemiology of vesicular stomatitis New Jersey virus from the 2004-2005 US
outbreak indicates a common origin with Mexican strains. Journal of General
Virology 88: 2042–2051. doi:10.1099/vir.0.82644-0.
41.
Letchworth GJ, Rodriguez LL, Del C Barrera J (1999) Vesicular stomatitis. Vet J
157: 239–260. doi:10.1053/tvjl.1998.0303.
42.
Thomas D, Newcomb WW, Brown JC, Wall JS, Hainfeld JF, et al. (1985) Mass
and molecular composition of vesicular stomatitis virus: a scanning transmission
electron microscopy analysis. J Virol 54(2): 598–607.
43.
Moyer SA, Smallwood-Kentro S, Haddad A, Prevec L (1991) Assembly and
transcription of synthetic vesicular stomatitis virus nucleocapsids. J Virol 65:
2170–2178.
44.
Schubert M, Harmison GG, Richardson CD, Meier E (1985) Expression of a
cDNA encoding a functional 241-kilodalton vesicular stomatitis virus RNA
polymerase. Proc Natl Acad Sci USA 82: 7984–7988.
45.
Green TJ, Macpherson S, Qiu S, Lebowitz J, Wertz GW, et al. (2000) Study of the
assembly of vesicular stomatitis virus N protein: role of the P protein. J Virol 74:
9515–9524.
46.
Howard M, Wertz G (1989) Vesicular stomatitis virus RNA replication: a role for
the NS protein. J Gen Virol 70 ( Pt 10): 2683–2694.
47.
Takacs AM, Das T, Banerjee AK (1993) Mapping of interacting domains between
the nucleocapsid protein and the phosphoprotein of vesicular stomatitis virus by
using a two-hybrid system. Proc Natl Acad Sci USA 90: 10375–10379.
48.
La Ferla FM, Peluso RW (1989) The 1:1 N-NS protein complex of vesicular
stomatitis virus is essential for efficient genome replication. J Virol 63: 3852.
49.
Green TJ (2006) Structure of the Vesicular Stomatitis Virus Nucleoprotein-RNA
Complex. Science 313: 357–360. doi:10.1126/science.1126953.
50.
Zhang X, Green TJ, Tsao J, Qiu S, Luo M (2008) Role of Intermolecular
Interactions of Vesicular Stomatitis Virus Nucleoprotein in RNA Encapsidation. J
Virol 82: 674–682. doi:10.1128/JVI.00935-07.
51.
Finke S, Brzózka K, Conzelmann K-K (2004) Tracking fluorescence-labeled
rabies virus: enhanced green fluorescent protein-tagged phosphoprotein P supports
virus gene expression and formation of infectious particles. J Virol 78: 12333–
12343. doi:10.1128/JVI.78.22.12333-12343.2004.
52.
Emerson SU, Schubert M (1987) Location of the binding domains for the RNA
polymerase L and the ribonucleocapsid template within different halves of the NS
phosphoprotein of vesicular stomatitis virus. Proc Natl Acad Sci USA 84: 5655–
5659.
53.
Chen M, Ogino T (2006) Mapping and functional role of the self-association
domain of vesicular stomatitis virus phosphoprotein. J Virol 80(19): 9511–9518.
54.
Ding H, Green T, Lu S (2006) Crystal structure of the oligomerization domain of
the phosphoprotein of vesicular stomatitis virus. J Virol 80(6): 2808–2814.
55.
Chen JL, Das T, Banerjee AK (1997) Phosphorylated states of vesicular stomatitis
virus P protein in vitro and in vivo. Virology 228: 200–212.
doi:10.1006/viro.1996.8401.
56.
Paul PR, Chattopadhyay D, Banerjee AK (1988) The functional domains of the
phosphoprotein (NS) of vesicular stomatitis virus (Indiana serotype). Virology
166: 350–357.
57.
Pattnaik AK, Hwang L, Li T, Englund N, Mathur M, et al. (1997) Phosphorylation
within the amino-terminal acidic domain I of the phosphoprotein of vesicular
stomatitis virus is required for transcription but not for replication. J Virol 71(11):
8167–8175.
58.
Hwang LN, Englund N, Das T, Banerjee AK, Pattnaik AK (1999) Optimal
replication activity of vesicular stomatitis virus RNA polymerase requires
phosphorylation of a residue(s) at carboxy-terminal domain II of its accessory
subunit, phosphoprotein P. J Virol 73: 5613–5620.
59.
Canter D (1996) Stabilization of Vesicular Stomatitis Virus L Polymerase Protein
by P Protein Binding: A Small Deletion in the C-Terminal Domain of L Abrogates
Binding. Virology 219: 376–386. doi:10.1006/viro.1996.0263.
60.
Poch O, Blumberg BM, Bougueleret L, Tordo N (1990) Sequence comparison of
five polymerases (L proteins) of unsegmented negative-strand RNA viruses:
theoretical assignment of functional domains. J Gen Virol 71 ( Pt 5): 1153–1162.
61.
Smallwood S, Easson CD, Feller JA, Horikami SM, Moyer SA (1999) Mutations
in Conserved Domain II of the Large (L) Subunit of the Sendai Virus RNA
Polymerase Abolish RNA Synthesis. Virology 262: 375–383.
doi:10.1006/viro.1999.9933.
62.
Schnell MJ, Conzelmann KK (1995) Polymerase activity of in vitro mutated rabies
virus L protein. Virology 214: 522–530. doi:10.1006/viro.1995.0063.
63.
Canter D, Jackson R, Perrault J (1996) Constitutive phosphorylation of the
vesicular stomatitis virus P protein modulates polymerase complex formation but
is not essential for transcription or replication. J Virol 70(7): 4538–4548.
64.
Rahmeh AA, Schenk AD, Danek EI, Kranzusch PJ, Liang B, et al. (2010)
Molecular architecture of the vesicular stomatitis virus RNA polymerase. Proc
Natl Acad Sci USA 107: 20075–20080. doi:10.1073/pnas.1013559107.
65.
Dunker AK, Silman I, Uversky VN, Sussman JL (2008) Function and structure of
inherently disordered proteins. Curr Opin Struct Biol 18: 756–764.
doi:10.1016/j.sbi.2008.10.002.
66.
Ferron F, Longhi S, Canard B, Karlin D (2006) A practical overview of protein
disorder prediction methods. Proteins 65: 1–14. doi:10.1002/prot.21075.
67.
Gunasekaran K, Tsai C-J, Kumar S, Zanuy D, Nussinov R (2003) Extended
disordered proteins: targeting function with less scaffold. Trends Biochem Sci 28:
81–85. doi:10.1016/S0968-0004(03)00003-3.
68.
Dosztányi Z, Csizmok V, Tompa P, Simon I (2005) IUPred: web server for the
prediction of intrinsically unstructured regions of proteins based on estimated
energy content. Bioinformatics 21: 3433–3434. doi:10.1093/bioinformatics/bti541.
69.
Yang ZR, Thomson R, McNeil P, Esnouf RM (2005) RONN: the bio-basis
function neural network technique applied to the detection of natively disordered
regions in proteins. Bioinformatics 21: 3369–3376.
doi:10.1093/bioinformatics/bti534.
70.
Thomson R, Hodgman TC, Yang ZR, Doyle AK (2003) Characterizing proteolytic
cleavage site activity using bio-basis function neural networks. Bioinformatics 19:
1741–1747. doi:10.1093/bioinformatics/btg237.
71.
Thomson R, Esnouf R (2004) Prediction of Natively Disordered Regions in
Proteins Using a Bio-basis Function Neural Network. Lecture Notes in Computer
Science. Berlin, Heidelberg: Springer Berlin
Heidelberg, Vol. 3177. pp. 108–116. doi:10.1007/978-3-540-28651-6_16.
72.
Linding R, Jensen LJ, Diella F, Bork P, Gibson TJ, et al. (2003) Protein disorder
prediction: implications for structural proteomics. Structure 11: 1453–1459.
73.
Kabsch W, Sander C (1983) Dictionary of protein secondary structure: pattern
recognition of hydrogen-bonded and geometrical features. Biopolymers 22: 2577–
2637. doi:10.1002/bip.360221211.
74.
Romero P, Obradovic Z, Li X, Garner EC, Brown CJ, et al. (2001) Sequence
complexity of disordered protein. Proteins 42: 38–48.
75.
Li X, Romero P, Rani M, Dunker A, Obradovic Z (1999) Predicting Protein
Disorder for N-, C-, and Internal Regions. Genome Inform Ser Workshop Genome
Inform 10: 30–40.
76.
Ferron F, Longhi S, Henrissat B, Canard B (2002) Viral RNA-polymerases-a
predicted 2'-O-ribose methyltransferase domain shared by all Mononegavirales.
Trends Biochem Sci 27: 222–224.
77.
Olmea O, Valencia A (1997) Improving contact predictions by the combination of
correlated mutations and other sources of sequence information. Folding and
Design 2: S25–S32. doi:10.1016/S1359-0278(97)00060-6.
78.
Bujnicki JM, Rychlewski L (2002) In silico identification, structure prediction and
phylogenetic analysis of the 2'-O-ribose (cap 1) methyltransferase domain in the
large structural protein of ssRNA negative-strand viruses. Protein Eng 15: 101–
108.
79.
Fariselli P, Casadio R (1999) A neural network based predictor of residue contacts
in proteins. Protein Eng 12: 15–21.
80.
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323: 533–536.
81.
Mayrose I, Graur D, Ben-Tal N, Pupko T (2004) Comparison of site-specific rate-inference methods for protein sequences: empirical Bayesian methods are superior.
Mol Biol Evol 21: 1781–1791. doi:10.1093/molbev/msh194.
82.
Fares MA, McNally D (2006) CAPS: coevolution analysis using protein
sequences. Bioinformatics 22: 2821–2822. doi:10.1093/bioinformatics/btl493.
83.
Westfall PH, Young SS (1993) Resampling-based multiple testing: Examples and
methods for p-value adjustment. Wiley, New York
84.
Cevik B, Smallwood S, Moyer SA (2003) The L-L oligomerization domain resides
at the very N-terminus of the sendai virus L RNA polymerase protein. Virology
313: 525–536.
85.
Cevik B, Smallwood S, Moyer SA (2007) Two N-terminal regions of the Sendai
virus L RNA polymerase protein participate in oligomerization. Virology 363:
189–197. doi:10.1016/j.virol.2007.01.032.
86.
Li J, Rahmeh A, Morelli M, Whelan SPJ (2008) A Conserved Motif in Region V
of the Large Polymerase Proteins of Nonsegmented Negative-Sense RNA Viruses
That Is Essential for mRNA Capping. J Virol 82: 775–784.
87.
Li J, Fontaine-Rodriguez EC, Whelan SPJ (2005) Amino Acid Residues within
Conserved Domain VI of the Vesicular Stomatitis Virus Large Polymerase Protein
Essential for mRNA Cap Methyltransferase Activity.
88.
Curran J, Boeck R, Lin-Marq N, Lupas A, Kolakofsky D (1995)
Paramyxovirus Phosphoproteins Form Homotrimers as Determined by an Epitope
Dilution Assay, via Predicted Coiled Coils. Virology 214: 139–149.
doi:10.1006/viro.1995.9946.
89.
Rahaman A, Srinivasan N, Shamala N, Shaila MS (2004) Phosphoprotein of the
rinderpest virus forms a tetramer through a coiled coil region important for
biological function. A structural insight. J Biol Chem 279: 23606–23614.
doi:10.1074/jbc.M400673200.
APPENDICES
APPENDIX A
SUPPLEMENTARY TABLE 2.1
Supplementary Table 2.1 List of predicted disordered and CICP residues for each virus's
N protein. The numbers in the Disordered Regions and CICP Regions columns correspond
to the unaligned residue position(s) of each sequence. A.) Paramyxoviridae B.)
Rhabdoviridae C.) Filoviridae D.) Bornaviridae. The table columns are: Sequence - the
abbreviated name of the virus (see Methods), Disordered Regions - the location of the
disordered residues corresponding to sequence position, # Disordered Residues - the total
number of disordered amino acids in the sequence, % of Sequence Disordered - the
percentage of disordered residues in the sequence, CICP Regions - the location of the
CICP residues corresponding to the sequence position, # CICPs - the total number of
CICPs for the sequence, % of Sequence CICPs - the percentage of CICP-positive
residues in the sequence, Disordered and CICP - residue positions that are positive for
both CICP and disorder in the sequence, # Both - the total number of residues that are
both disordered and a CICP in the sequence, % Both - the percentage of residues that are
both disordered and a CICP in the sequence.
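The count and percentage columns described above follow directly from the region strings. The sketch below is illustrative only and is not the DisSIC pipeline code: the function names are hypothetical, the region string is the MARV entry from panel C, and the sequence length of 695 is an assumption taken from the last position listed in that entry.

```python
def parse_regions(text):
    """Expand a region string such as "1, 5, 29-37" into a sorted list of positions."""
    positions = set()
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate trailing commas, which occur in the table
        if "-" in token:
            start, end = (int(part) for part in token.split("-"))
            positions.update(range(start, end + 1))  # ranges are inclusive
        else:
            positions.add(int(token))
    return sorted(positions)


def summarize(regions, sequence_length):
    """Return (number of listed residues, percentage of the sequence they cover)."""
    count = len(parse_regions(regions))
    return count, 100.0 * count / sequence_length


# MARV Disordered Regions entry from panel C.
marv = ("1-13, 312-320, 333, 336-353, 393-403, 412-624, 629-630, "
        "648-668, 670-673, 675-685, 695-695")
# ASSUMPTION: length 695 is inferred from the last listed position.
count, percent = summarize(marv, sequence_length=695)
print(count, percent)
```

The same two functions apply unchanged to the CICP Regions column.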
A. Paramyxoviridae Nucleoproteins
Sequence
Disordered Regions
# Disordered
Residues
% of Sequence
Disordered
HMPNV
62
15
0
% of
Sequence
CICPs
0
0
0
AVPNV
72
18
0
0
0
0
30
7
0
0
0
0
38
9
0
0
0
0
30
7
0
0
0
0
38
9
0
0
0
0
25
6
0
0
0
0
35
8
0
0
0
0
34
8
0
0
0
0
192
34
58
10
,101,104,105,1
06,148,385,38
7
7
1
58
11
13
2
57
10
,101,104,105,1
06,107,115,11
6,120,205,207,
385,387,516
,35,148,385
3
0
49
9
,97,98,148,385
4
0
50
9
,93,98,384
3
0
53
10
11
2
47
10
,98,101,104,10
5,106,148,205,
210,385,387,5
04
,148,377
2
0
72
14
,375,383,385
3
0
67
13
,375,383,385
3
0
58
10
,38,40,98
3
0
51
10
0
0
73
13
3
0
1, 5, 29-37, 140-155, 191-201,
297-303, 370, 379-394
1, 28-40, 134-156, 193-201,
297-303, 367-370, 380-394
HRSVB1 1, 26, 28-33, 148-151, 334-338,
379-391
HRSVA2
1-2, 25-35, 99, 101-104, 148151, 334-338, 381-391
HRSVS2
1-2, 26, 28-34, 148-151, 334338, 381-391
RSV
1-2, 25-35, 99, 101-104, 148151, 334-338, 381-391
BRSV
122, 125, 148-151, 193-194,
334-338, 380-391
PNVM15
1-6, 138-150, 190, 192-193,
381-393
PNVMJ3666 1-6, 138-149, 190, 192-193,
381-393
MuV
1, 16-29, 89-95, 98-106, 108,
138-152, 379-389, 405-470,
482-549
TIOV
18-29, 99-127, 142-158, 186197, 201-207, 371-474, 477-478,
480-511, 514-517, 519-519
220
42
MENV
16-32, 34-39, 123-129, 133-135,
140-156, 186-196, 372-410, 412,
418-470, 485-512, 517-519
185
35
SPIV41
1, 16-28, 90-99, 141-145, 148149, 151-152, 373-388, 405-418,
420, 447-501, 520-543
1, 16-30, 90-100, 139-145, 147152, 194-195, 372-388, 401-439,
445-503, 516-542
1, 16-30, 89-110, 142-152, 196198, 200-210, 375-389, 399-434,
450-484, 486-487, 495, 497-509
143
26
184
33
165
32
AVPMV6
1, 18-28, 133-154, 181-194,
245-246, 372-389, 391, 399-465
136
29
GPV
1, 15-28, 115-116, 144-158,
184-191, 193-200, 242-245,
372-387, 398-444, 458-489
147
30
NCDV
1, 15-27, 112-116, 143-158,
184-198, 243, 373-386, 398-446,
457-489
147
30
TUPV
1, 37-48, 93-98, 139-152, 376389, 424-500, 521-547, 551-552
153
27
FDLV
110, 114, 178-196, 402-431,
441-445, 447-464, 466, 471-471
76
16
NIPH
22-23, 109, 111-124, 132-147,
182-193, 380, 395-409, 420-447,
455-473, 489-529, 532-532
150
28
HPIV2
SPIV5
CICP Regions
35, 41, 74, 78, 101, 104-107, 115-116, 148, 171-172, 205, 207, 210,
224, 230, 250, 252, 254-255, 258, 266, 271-273, 275-278, 287, 300,
304, 313-317, 325, 332-333, 335, 338-339, 341-344, 348, 351, 353,
355, 357, 363, 385, 387,
35, 41, 74, 78, 98, 101, 104-107, 115-116, 120, 171-172, 205, 207,
210, 224, 230, 234, 251, 255, 258, 266, 269, 271-273, 275-276, 278,
287, 300, 304, 313-317, 325, 332-333, 335, 338-339, 341-344, 348,
351, 355, 357, 363, 385, 387, 516,
35, 40-41, 59, 74, 76, 78, 98, 101, 104-107, 115-116, 148, 171, 205,
207, 210, 224, 230, 234, 255, 258, 266, 271-273, 275-278, 287, 300,
304, 313, 315-317, 325, 332-333, 335, 338-339, 342-344, 348, 351,
353, 355, 357, 363, 385, 516,
35, 41, 74, 97-98, 101, 104-107, 115-116, 148, 205, 207, 210, 228,
230, 250, 254-255, 258, 266, 271-273, 275, 278, 287, 300, 313, 315317, 325, 333, 335, 338-339, 341-344, 348, 351, 355, 357, 363, 385,
35, 40-41, 74, 78, 93, 98, 101, 104-107, 115-116, 204, 206, 209, 219,
229, 257, 265-266, 269-272, 274, 277, 279, 284, 286, 310, 312-317,
324, 334, 337-338, 341-343, 347, 350, 356, 362, 384,
35, 41, 74, 98, 101, 104-106, 115-116, 118, 148, 171-172, 205, 210,
224, 230, 254-255, 258, 266, 271-273, 275-276, 278, 287, 300, 313317, 325, 333, 335, 338-339, 341-344, 348, 351, 353, 355, 357, 363,
385, 387, 504,
38, 41, 74, 80, 101, 104-106, 115, 117-118, 148, 209, 220, 234, 255,
258, 262, 266-267, 272-273, 276, 278, 285, 287, 311-315, 317-318,
325, 330, 333, 338-339, 341-343, 348, 351, 355, 358, 363, 377,
36, 38, 40, 57, 75, 77, 90-91, 98-99, 102, 174-175, 181, 183, 203, 207208, 218, 228, 232, 249, 256, 259-260, 265-266, 268-274, 276, 283,
285, 295, 298, 309, 312-313, 315-316, 323-324, 328, 331-342, 346,
349-353, 355-356, 361, 363, 375, 383, 385,
38, 40, 75, 77, 91, 98-99, 102, 175, 181, 183, 203, 207, 218, 222, 228,
232, 249, 253, 256, 259-260, 265-266, 268-269, 272-274, 276, 285,
295, 298, 309-313, 315-316, 323-324, 328, 331-342, 346, 349-351,
353, 355-356, 361, 363, 375, 383, 385,
27-28, 30-31, 36, 38, 40, 72, 91, 98-99, 102, 113-114, 116, 169, 181,
203, 207-208, 210, 218, 256, 259-260, 264-265, 268-269, 272-274,
276, 285, 302, 306, 309-313, 315-316, 323, 331-338, 340-342, 346,
355, 361,
27, 30, 36, 38, 77, 98-100, 102-103, 112, 115, 165, 176, 198, 202, 248,
255, 260, 264, 266-269, 271, 278, 280, 306-311, 318, 326-327, 330337, 341-342, 344, 350-351, 356, 378,
38, 40, 72, 75, 77-78, 98-99, 102, 113-115, 151, 176-178, 181, 203,
207-208, 218, 222, 228, 232, 249, 253, 256, 259-260, 264-266, 268270, 272-274, 276, 283, 285, 295, 298, 302, 309-313, 315-316, 323,
328, 331-342, 346, 348-350, 353, 355, 361, 375,
#
CICPs
Disordered
and CICP
,113,114,115
# Both % Both
HV
1, 22-23, 112-123, 132-147,
182-193, 395-408, 418-475,
488-530, 532-532
159
29
MOSV
19-23, 130-133, 135-155, 184194, 377-383, 426-471, 476-528
147
27
BEIV
1-7, 12, 15-16, 116-139, 187201, 239, 372-385, 401-522
186
35
JV
15-16, 18-19, 110-143, 186-192,
194-197, 382-385, 395-485,
488-522
179
34
CDV
19-26, 395-451, 454-523
135
25
PDV
405-412, 428-435, 480-486,
507-523
40
7
DMV
22, 113-119, 127-135, 142-149,
190-194, 197, 199-204, 209-211,
376-377, 402-413, 419-488,
502-514, 517-523
15-31, 127-135, 138, 158, 209211, 395-412, 418-489, 502-525
144
27
145
27
MeV
15-28, 111-160, 203-204, 207211, 376-393, 404-408, 417-490,
503-525
191
36
RPV
1, 16-27, 61-64, 113-119, 127136, 208-214, 375-389, 399-492,
508-525
168
32
HPV1
1, 21-30, 110-121, 376-389,
401-413, 436-445, 458-520,
522-524
1, 20-29, 111-121, 377-388,
402-414, 419-447, 460-479,
489-524
18-25, 144-145, 147, 371-383,
404-515
126
24
132
25
136
26
1, 19-23, 145, 147-148, 372-392,
394-446, 448-515
151
29
PDPRV
SENV
BPV3
HPV3
30, 36, 38, 40, 72, 75, 77, 98-99, 102, 113-116, 146, 151, 176-178,
181, 203, 207-208, 218, 222, 228, 232, 249, 256, 259-260, 264-266,
268-270, 272-274, 276, 282-283, 285, 295, 298, 302, 309, 312-313,
315-316, 323, 328, 331-342, 346, 348-350, 353, 355, 361, 375,
28, 30, 36, 40, 78, 98, 102, 107, 113, 116, 151, 181, 203, 205, 207208, 218, 222, 228, 232, 249, 256, 259-260, 264-266, 269-274, 276,
285, 298, 309-313, 315-316, 323, 328-329, 331-338, 340-342, 346,
349-350, 355-356, 361, 363, 375, 383,
28, 36, 40-41, 87, 98-99, 102, 113-115, 118, 169, 181, 203, 205, 207209, 228, 256, 259-260, 264-265, 269-274, 276, 285, 295, 298, 302,
309-313, 315-316, 323, 331-338, 340-342, 346, 352, 355-356, 361,
383,
35, 74, 78, 98, 101, 104-107, 113, 115-116, 170, 203, 205, 207-208,
211, 218, 232, 249, 253, 256, 260, 264-266, 269-271, 273-274, 276,
283, 285, 298, 309-311, 313-315, 323, 328, 331, 333, 336-337, 339342, 346, 349-350, 353, 355, 361, 375, 383,
28, 36, 38, 40, 59, 90-91, 98, 102, 115, 117, 119-120, 171, 176-177,
183, 185, 205, 207-211, 220, 224, 230, 234, 251, 255, 258, 262, 266267, 270-276, 278, 287, 314-318, 325, 330, 333-338, 340, 342-344,
348, 351-353, 357-358, 363, 377, 385,
28, 30, 36, 38, 40, 59, 72, 75, 91, 98-99, 102, 114-115, 118-120, 183,
205, 209-210, 220, 230, 255, 258, 262, 266-267, 270, 272-276, 278,
287, 312-318, 325, 330, 333-340, 342-344, 348, 351, 357, 363, 385,
28, 31, 36, 38, 40, 72, 74-75, 89, 91, 98-99, 102, 115-116, 118-120,
176, 183, 205, 209-210, 220, 230, 234, 255, 258, 262, 266-267, 270276, 278, 285, 287, 312-318, 325, 330, 333-340, 342-344, 348, 351,
357, 363,
28, 30, 36, 40-41, 91, 98-99, 102, 107, 115-117, 119-120, 183, 205,
207, 209-211, 220, 230, 234, 255, 258, 262, 266, 270, 272-276, 278,
287, 311, 313-318, 325, 330, 333-338, 340, 342-344, 348, 351, 355,
357, 363, 418,
28, 36, 38, 40, 90, 98-99, 102, 107, 115, 117, 119-120, 171, 183, 205,
207-211, 213, 220, 230, 234, 251, 255, 258, 262, 266-268, 270-276,
278, 285, 287, 314-318, 325, 330, 333, 337-338, 340-344, 348, 351352, 355, 357-358, 363, 385,
28, 36, 38, 40, 72, 74-75, 89, 91, 98-99, 101-102, 107, 115-116, 118120, 176, 183, 205, 207, 209-210, 213, 220, 230, 234, 255, 262, 267,
272-276, 278, 287, 312-317, 325, 330, 333-334, 336-344, 348, 351,
355, 357-358, 363, 385,
28, 35, 41, 99, 102, 104, 107, 119, 123, 176, 184, 205, 210, 251, 255,
258, 261-262, 267, 270-271, 275-276, 285, 287, 311, 313-318, 325,
330, 333-340, 342-344, 348, 351, 358, 363, 385, 387,
38, 99, 101-102, 104, 119, 122, 176, 184, 205, 210, 239, 251, 255,
258, 261-262, 267, 273-276, 285, 287, 311, 313-318, 325, 330, 333340, 342-344, 348, 351, 355, 357-358, 363, 385, 387, 462,
6, 25, 28, 77, 86, 96, 98-101, 117, 119-120, 175-176, 183, 204, 208209, 229, 238, 257, 260-261, 264, 266-267, 270-275, 277, 279, 284,
286, 296, 299, 310, 313-317, 324, 329, 331-343, 347, 350-351, 354,
356, 362, 376, 384, 386,
6, 25, 28, 86, 96-98, 100-101, 105-106, 116, 118-120, 170, 175, 204,
208-209, 219, 234, 260-261, 266-267, 269-275, 277, 284, 286, 310,
313-314, 316-317, 324, 329, 334-339, 341-343, 347, 354, 356, 362,
384, 386,
74
13
,113,114,115,1
16,146
5
0
66
12
,151,383
2
0
61
11
,118,383
2
0
60
11
,113,115,116,3
83
4
0
69
13
0
0
61
11
0
0
65
12
,115,116,118,1
19,209,210
6
1
61
11
,28,30,209,210
,211,418
6
1
65
12
11
2
65
12
,28,115,117,11
9,120,207,208,
209,210,211,3
85
,115,116,118,1
19,209,210,21
3,385
8
1
51
9
,28,119,385,38
7
4
0
53
10
,119,385,387,4
62
4
0
69
13
,25,376
2
0
58
11
,384,386
2
0
B. Rhabdoviridae Nucleoproteins
Sequence
Disordered Regions
FLAV
1-5, 116-124, 167-175,
352-386, 426-436,
438-439
10-16, 38-45, 282-284,
349-381,
71
SCRV
1, 16-33, 351-378,
405-421, 423, 429-429
66
ISFV
1, 17-20, 119-130,
317-322, 366-370,
423-423
1, 19-20, 28-29, 117128, 266-267, 352372, 422-422
1, 16-27, 112, 114122, 315-323, 343351, 353, 359-365,
397-409,
1-2, 13-21, 116-128,
317-320, 360-371,
422-422
1, 15-21, 121, 261263, 265, 319-320,
356-369, 392-396,
416-422
1, 15-21, 121, 261263, 265, 319-320,
356-369, 392-396,
416-422
2, 37-46, 104-108,
273-274, 276, 391406, 409-423, 443,
445-450
1-2, 103-109, 124-134,
273-274, 276, 378401, 411-429, 443-450
1-2, 127-128, 371-403,
450-450
29
BEFV
CHPV
SVCV
VSNJV
VSIV
VSSJV
ABLV
RABV
MOKV
# Disordered Residues
% of Sequence
Disordered
51
41
62
CICP Regions
16 55, 90, 95, 102-103, 106-107, 109, 139, 214, 216-218,
220, 224, 230, 273-274, 276-279, 296-297, 315, 326,
333, 337, 344-345, 394,
11 4, 73, 95, 97, 103, 106-107, 117, 134, 193, 214, 216,
219-221, 227, 237, 276-279, 295, 297, 315, 333, 386389, 392, 395,
15 11, 43, 94, 106, 110-111, 113, 121, 138, 143, 147-148,
156, 176, 178, 197, 205, 218, 220-224, 228-229, 234,
264, 277-283, 299-301, 315, 319, 330, 347, 386,
6 57, 91, 96, 98, 103, 105-106, 111, 146, 176, 215, 217221, 227, 231, 233, 278-279, 281-283, 285, 293-294,
300-301, 336, 346-347, 349, 377, 379, 381-382, 418,
9 57, 92, 97, 104, 108-109, 136, 141, 145-146, 197-198,
217, 219-224, 228, 233, 276, 279-281, 314, 318, 320,
345, 348, 352, 377-378,
14 54, 57, 89, 94, 101, 105-106, 108, 133, 138, 142-143,
151, 214, 216, 220, 224-225, 228-230, 273, 276-278,
281, 311, 315, 373, 375, 378,
#
% of Sequence Disordered
CICPs CICPs
and CICP
31
7
# Both % Both
0
0
31
7
0
0
42
9
0
0
38
8
0
0
33
7 ,352
1
0
31
7 ,315
1
0
41
9 56, 91, 96, 103, 107-108, 118, 135, 140, 144-145, 153,
216, 218, 220, 222, 232, 275, 278-281, 298, 317, 328,
25
5 ,118,317
2
0
41
9 56, 91, 96, 103, 108, 140, 144-145, 184, 216, 218, 220,
222, 232, 275, 278-281, 298, 317, 328,
22
5
0
0
41
9 56, 91, 96, 103, 108, 140, 144-145, 184, 216, 218, 220,
222, 232, 275, 278-281, 298, 317, 328,
22
5
0
0
57
12 8, 10, 92, 97, 104, 108-109, 111, 227, 229-231, 233, 243,
286, 289-290, 308-310, 312-315, 324, 328-330, 332,
356, 388, 406, 416-417, 421, 425,
36
8 ,104,108,406,4
16,417,421
6
1
74
16 11, 92, 108, 111, 215, 229-234, 236-237, 240, 243, 248,
286-288, 290, 308, 313, 315, 330, 332, 355, 357, 411,
416-417, 421, 427, 431,
8 8, 22, 92, 97, 109, 146, 151, 227, 229, 231, 233, 243,
248, 286, 289-290, 308-310, 312-315, 328, 330, 358,
388, 406, 417, 421, 429,
33
7 ,108,411,416,4
17,421,427
6
1
31
6 ,388
1
0
38
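Sequence | Disordered Regions | # Disordered Residues | % of Sequence Disordered | CICP Regions | # CICPs | % of Sequence CICPs | Disordered and CICP | # Both | % Both
NCMV | 1-11, 315-316, 318-320, 367-375, 410-428, 430-431 | 46 | 10 | none | 0 | 0 | none | 0 | 0
LNYV | 1-3, 6-7, 19-66, 121-143, 148, 150, 165-173, 193-199, 452, 454-456, 458-459 | 100 | 21 | none | 0 | 0 | none | 0 | 0
SYNV | 1-8, 17-34, 122-133, 139-155, 419-463, 465-471, 473-475 | 110 | 23 | none | 0 | 0 | none | 0 | 0
MFSV | 33, 117-137, 142-151, 314-316, 373, 375, 409-422, 424-460, 462-462 | 89 | 19 | none | 0 | 0 | none | 0 | 0
RYSV | 26-35, 101-150, 199-202, 354-377, 396-508 | 201 | 38 | none | 0 | 0 | none | 0 | 0
MMV | 1, 25-37, 123-146, 345-355, 397-447 | 100 | 22 | none | 0 | 0 | none | 0 | 0
TVCV | 1, 30-33, 114-139, 347-348, 350, 394-403, 409, 411-416, 421-447, 467-482, 495 | 95 | 18 | none | 0 | 0 | none | 0 | 0
SNAKV | 19-30, 34, 99-108, 160-166, 341-349, 351-352, 360-399 | 81 | 20 | none | 0 | 0 | none | 0 | 0
VHSV | 1-2, 18-27, 345-404 | 72 | 17 | none | 0 | 0 | none | 0 | 0
HIRV | 1-2, 12-24, 102-115, 311-318, 344-392 | 86 | 21 | none | 0 | 0 | none | 0 | 0
IHNV | 1-3, 13-31, 98-112, 315, 317-324, 342-391 | 96 | 24 | none | 0 | 0 | none | 0 | 0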
C. Filoviridae Nucleoproteins
Sequence | Disordered Regions | # Disordered Residues | % of Sequence Disordered | CICP Regions | # CICPs | % of Sequence CICPs | Disordered and CICP | # Both | % Both
MARV | 1-13, 312-320, 333, 336-353, 393-403, 412-624, 629-630, 648-668, 670-673, 675-685, 695-695 | 304 | 43 | none | 0 | 0 | none | 0 | 0
REBOV | 1-3, 120-125, 128, 131-145, 265-266, 330-339, 354, 356, 358-366, 408-473, 483-644 | 276 | 37 | none | 0 | 0 | none | 0 | 0
SEBOV | 2-3, 5, 117-123, 125, 128-145, 330-338, 354, 356, 358-368, 411-644, 683 | 286 | 38 | none | 0 | 0 | none | 0 | 0
ZEBOV | 1-12, 109-112, 117-125, 132-145, 262-269, 330-339, 354, 358-367, 413-474, 476-650, 683-684, 687, 697-701, 703-708 | 319 | 43 | none | 0 | 0 | none | 0 | 0
D. Bornaviridae Nucleoprotein
Sequence | Disordered Regions | # Disordered Residues | % of Sequence Disordered | CICP Regions | # CICPs | % of Sequence CICPs | Disordered and CICP | # Both | % Both
BDV | 1-25, 41-51, 98-107, 319-354 | 82 | 22 | none | 0 | 0 | none | 0 | 0
APPENDIX B
SUPPLEMENTARY FIGURES FOR CHAPTER 3
Supporting Information For Chapter 3
[Figure S3.1: panels A-E; conserved domains I-VI are marked beneath each graph]
Figure S3.1. Disorder Consensus Alignment Consensus Graphs For 63 L polymerase
sequences. A.) Paramyxoviridae B.) Rhabdoviridae C.) Filoviridae D.) Bornaviridae
E.) the entire Order. Graphs A, B, C, D and E represent the L disorder results of Figure 1. The
number of disordered residues occurring at a position of the analyzed MSA was
summed and divided by the total number of sequences that could participate in the
disorder study from that alignment. The y-axis is the percentage of residues predicted to
be disordered and the x-axis is the residue position in the MSA. The disorder
percentages are plotted in blue. The boxes below the graphs correspond to the conserved
domains: I (green), II (blue), III (orange), IV (red), V (yellow) and VI (purple).
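The per-position calculation described in the caption can be sketched as follows. This is a minimal illustration rather than the code used to produce the figure: the function name, the dictionary inputs and the toy three-sequence alignment are all hypothetical, and it assumes 1-based unaligned positions, "-" as the gap character, and a denominator equal to the number of sequences in the disorder study regardless of gaps in a column.

```python
def column_fractions(aligned, disordered):
    """Fraction of sequences predicted disordered at each MSA column.

    aligned:    dict of name -> aligned sequence ('-' marks a gap)
    disordered: dict of name -> set of 1-based unaligned disordered positions
    """
    # Only sequences that took part in the disorder study are counted.
    names = [name for name in aligned if name in disordered]
    if not names:
        raise ValueError("no sequence has disorder predictions")
    width = len(aligned[names[0]])
    counts = [0] * width
    for name in names:
        residue_index = 0  # position in the unaligned sequence
        for column, char in enumerate(aligned[name]):
            if char == "-":
                continue  # gaps do not advance the unaligned position
            residue_index += 1
            if residue_index in disordered[name]:
                counts[column] += 1
    return [c / len(names) for c in counts]


# Toy data, invented for illustration only.
aligned = {"seqA": "MK-TL", "seqB": "M-STL", "seqC": "MKSTL"}
disordered = {"seqA": {1, 2}, "seqB": {1}, "seqC": {1, 5}}
fractions = column_fractions(aligned, disordered)
print(fractions)
```

Replacing the disorder sets with CICP position sets gives the corresponding CICP consensus values.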
[Figure S3.2: panels A-E; binding domains are marked beneath each graph, and panel E carries separate domain tracks for Paramyxoviridae, Rhabdoviridae, Filoviridae and Bornaviridae]
Figure S3.2. Disorder Consensus Alignment Consensus Graphs For 63 P sequences. A.)
Paramyxoviridae B.) Rhabdoviridae C.) Filoviridae D.) Bornaviridae E.) the entire
Order. Graphs A, B, C, D and E represent the P disorder results of Figure 2. The number of
disordered residues occurring at a position of the analyzed MSA was summed and
divided by the total number of sequences that could participate in the disorder study from
that alignment. The y-axis is the percentage of residues predicted to be disordered and
the x-axis is the residue position in the MSA. The disorder percentages are plotted in
blue. The boxes below the graphs correspond to the different binding domains:
oligomerization (green), N0 binding domain (blue), N-RNA binding domain (red), L
binding domain (yellow), X binding domain (orange), and interferon inhibitory domain
(purple). In E, all the family binding domains are shown.
[Figure S3.3: panels A-C; conserved domains I-VI are marked beneath each graph]
Figure S3.3. CICP Consensus Alignment Consensus Graphs For 37 L polymerase
sequences. A.) Paramyxoviridae B.) Rhabdoviridae C.) the entire Order. Graphs A, B
and C represent the L CICP results of Figures 1A, 1B and 1E. The number of CICPs occurring
at a position of the analyzed MSA was summed and divided by the total number of
sequences that could participate in the CICP study from that alignment (Paramyxoviridae
L, 25 sequences; Rhabdoviridae L, 12 sequences; and the entire Order L, 37 sequences).
The y-axis is the percentage of residues predicted to be a CICP and the x-axis is the
residue position in the MSA. The CICP percentages are plotted in blue. The boxes below
the graphs correspond to the conserved domains: I (green), II (blue), III (orange), IV
(red), V (yellow) and VI (purple).
[Figure S3.4: single graph annotated with the oligomerization, L-binding and N-RNA binding domains]
Figure S3.4. CICP Consensus Alignment Consensus Graphs For 15 Paramyxovirinae P
sequences. This graph represents the CICP results from Figure 2A. The number of CICPs
occurring at a position of the analyzed MSA was summed and divided by the total
number of sequences that could participate in the CICP study from that alignment. The
y-axis is the percentage of residues predicted to be a CICP and the x-axis is the residue
position in the MSA. The CICP percentages are plotted in blue. The boxes below the
graph correspond to the different binding domains: oligomerization domain (green),
N-RNA binding domain (red), and L binding domain (yellow).
APPENDIX C
SUPPLEMENTARY TABLE 3.1
Supplementary Table 3.1 List of predicted disordered and CICP residues for each virus's
L protein. The numbers in the CICP Positions and Disorder Positions columns correspond
to the unaligned residue position(s) of each sequence. A.) Paramyxoviridae B.)
Rhabdoviridae C.) Filoviridae D.) Bornaviridae. The table columns are: Name - the
abbreviated name of the virus (see Methods), CICP Positions - the location of the CICP
residues corresponding to the sequence position, CICP # - the total number of CICPs for
the sequence, CICP % - the percentage of CICP-positive residues in the sequence,
Disorder Positions - the location of the disordered residues corresponding to sequence
position, Disorder # - the total number of disordered amino acids in the sequence,
Disorder % - the percentage of disordered residues in the sequence, Both Positions -
residue positions that are positive for both CICP and disorder in the sequence, Both # - the
total number of residues that are both disordered and a CICP in the sequence, Both % -
the percentage of residues that are both disordered and a CICP in the sequence.
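The Both Positions column is the intersection of the CICP and disorder position lists, which this table writes as quoted lists of positions and ranges. The following sketch shows one way to derive it; it is not the DisSIC implementation, the function names are hypothetical, and the CICP list is only a short excerpt of the first data row of panel A (the tokens that fall near that row's disordered regions).

```python
import json


def expand(cell):
    """Expand a cell such as '["723", "1037-1038"]' into a set of integer positions."""
    positions = set()
    for token in json.loads(cell):
        start, _, end = token.partition("-")
        positions.update(range(int(start), int(end or start) + 1))
    return positions


def collapse(positions):
    """Collapse a set of positions back into the table's list-of-ranges form."""
    tokens = []
    run_start = previous = None
    for pos in sorted(positions):
        if previous is not None and pos == previous + 1:
            previous = pos  # extend the current run
            continue
        if run_start is not None:
            tokens.append(str(run_start) if run_start == previous
                          else f"{run_start}-{previous}")
        run_start = previous = pos
    if run_start is not None:
        tokens.append(str(run_start) if run_start == previous
                      else f"{run_start}-{previous}")
    return tokens


# Disorder positions of the first data row of panel A.
disorder = ('["1-2", "4-5", "633-655", "721-725", "789-793", '
            '"1030-1038", "1854-1855", "2240"]')
# Excerpt (not the full list) of the same row's CICP positions.
cicp_excerpt = '["723", "726-727", "789", "1033", "1037-1039", "1853-1854"]'

both = expand(disorder) & expand(cicp_excerpt)
print(collapse(both), len(both))
```

Working on sets of integers keeps the intersection independent of how the ranges happen to be split in either column.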
A. Paramyxoviridae L Polymerase
Name
AVPMV6
AVPNV
BEIV
CICP Position
CICP CICP %
Disorder
Disorder
#
Position
#
["8", "10-11", "14", "17", "20", "22-23", "25", "27", "29-32", "34", "36", "42", "44-45", "48", "52", "59", "64", "91", "95", 828 36.95% ["1-2", "4-5",
49
"107", "117", "124-125", "164", "171", "174", "185", "189", "191", "193-199", "204", "225", "227-229", "232", "234", "243"633-655",
245", "248", "254", "256", "262-264", "266-267", "270", "272-273", "283-284", "287", "290-291", "296", "298-299", "301"721-725",
302", "304-305", "307-308", "310", "313", "317-318", "322", "325", "327", "329-330", "338", "358", "362-365", "367",
"789-793",
"369-373", "375-376", "382-383", "389", "393", "395-396", "400", "402-411", "413", "415-416", "420", "422-425", "427",
"1030-1038",
"430", "433-434", "436-438", "444", "447-453", "456", "459-465", "467", "470-479", "481-484", "489-493", "495", "497"1854-1855",
501", "505", "513", "517", "519-520", "523", "525", "529", "533", "536-537", "539", "541-542", "544", "546", "552-554",
"2240"]
"563", "565-566", "569", "573", "575-577", "579-580", "583-584", "586-588", "590-593", "597-598", "601-609", "611-615",
"617-618", "621-623", "657-661", "664", "669", "671", "674", "677-682", "684-691", "693-696", "698", "700", "702",
"705", "709-710", "712", "715-716", "718", "723", "726-727", "731", "733-740", "746-747", "750", "753-755", "757-760",
"762", "764-768", "770-773", "780", "785-786", "789", "796-797", "800", "804", "807", "812", "814-815", "824-825",
"829", "831-833", "839-843", "846-848", "850", "853-873", "875-878", "880-883", "885", "888-889", "892-893", "895-896",
"899-902", "904", "906-908", "911-913", "915-917", "920", "923", "925", "927", "930-932", "934-935", "937", "940-941",
"945-946", "948-949", "952-954", "961-965", "967", "971-972", "976", "984-988", "993", "996-1000", "1004-1005",
"1007", "1009-1015", "1017", "1019-1023", "1025-1027", "1029", "1033", "1037-1039", "1042", "1046", "1054-1056",
"1059-1060", "1066-1067", "1070-1073", "1075-1077", "1080", "1082-1083", "1087-1091", "1093", "1096-1098", "1100",
"1103", "1106-1108", "1111", "1114-1116", "1119", "1138", "1140", "1147-1150", "1152-1154", "1157-1158", "1163",
"1168", "1171-1173", "1176", "1179-1180", "1185", "1187-1188", "1201-1202", "1205", "1209-1210", "1212-1213",
"1215", "1219-1222", "1224", "1227", "1230", "1232", "1235-1240", "1245", "1247-1249", "1253", "1255-1260", "1263",
"1269", "1272", "1277", "1279", "1281", "1283", "1287", "1289", "1291-1292", "1294-1295", "1297-1298", "1300-1301",
"1305", "1307-1308", "1311-1312", "1314-1316", "1318", "1320-1321", "1323", "1326-1327", "1331-1334", "1340", "13421343", "1352", "1357-1358", "1362-1363", "1368-1369", "1372", "1374-1375", "1379", "1385-1388", "1390", "1394",
"1396", "1408", "1411-1412", "1418", "1422", "1429", "1439", "1442", "1445-1446", "1450", "1453-1455", "1457", "14591461", "1464", "1466-1469", "1473", "1476-1479", "1481-1485", "1487", "1489-1494", "1502-1504", "1506-1509", "15121518", "1524", "1526-1527", "1532", "1534-1537", "1540", "1542", "1545-1548", "1551-1554", "1556", "1558", "1560",
"1562", "1564", "1566-1568", "1570-1571", "1573-1574", "1579", "1581-1583", "1586", "1590-1591", "1594", "1597",
"1602", "1604", "1606-1607", "1609", "1612", "1615", "1619-1621", "1623-1625", "1627", "1640", "1647-1649", "1676",
"1680", "1682", "1688", "1690", "1695-1698", "1700", "1707", "1767", "1781-1782", "1786-1787", "1790", "1792", "17941795", "1797", "1799-1800", "1804", "1812", "1814", "1817-1818", "1824-1825", "1828-1829", "1833", "1838", "18421844", "1848", "1850", "1853-1854", "1859-1861", "1865-1866", "1886", "1888-1892", "1894-1899", "1901", "1903",
"1908", "1910", "1914", "1916", "1918-1920", "1928", "1933-1934", "1936", "1939", "1941-1942", "1947", "1952", "1960",
"1977", "1980", "1985", "1987-1990", "1992-1993", "1996-1997", "2010", "2013", "2032", "2036", "2066-2067", "20702071", "2075", "2079", "2089", "2106", "2110-2111", "2121", "2126", "2130-2132", "2136", "2139-2140", "2172-2173",
"2175", "2180", "2195-2196", "2220-2222", "2224", "2228-2233"]
[]
0
0.00% ["1-5", "4992
52", "321",
"613-620",
"743-762",
"981-991",
"1172-1195",
"1601",
"1604-1605",
"1608-1611",
"1639-1643",
"1752-1756",
"2002",
"2004"]
["9-12", "18", "20-21", "24", "28", "30-31", "43", "86-87", "90", "93", "120", "159", "162", "164", "174-175", "178-179",
632 29.10% ["0-8", "36481
"211-213", "224", "229", "231", "233", "235", "237", "239-240", "243", "245", "251", "253-254", "260", "268", "272", "275365", "492277", "279", "285-286", "288-289", "291", "294-296", "298", "303", "305-306", "308", "310", "314-315", "319", "341",
496", "599"346-348", "352-354", "358-359", "364-366", "370", "372", "374", "376-377", "383", "385", "388-389", "394-396", "399",
626", "1074"401", "404-405", "408-410", "416", "419", "426-427", "431-433", "436", "439-440", "443", "445-448", "453-455", "4591075", "1187461", "464", "466-468", "472-473", "475", "478-480", "482-483", "499", "501-502", "504-505", "508", "512-513", "517",
1188", "1194"519-520", "523", "525", "528", "531-533", "535-537", "540", "542-543", "548", "551-556", "558-559", "561", "563", "5691199", "1202572", "578-582", "584-585", "588", "590-591", "593-594", "596-597", "599-601", "634-636", "640", "642", "648-652",
1206", "1263"655-660", "663-668", "671-673", "679", "681-682", "686-691", "693", "704", "711-714", "716-717", "723-724", "731-732",
1267", "1274"734", "736-739", "741-743", "745-746", "748-752", "757", "759", "761", "763", "773-774", "784-785", "789", "791-792",
1275", "1277"805-806", "808-813", "818-819", "822-824", "826", "828", "832-840", "842-851", "853-855", "857-861", "868-870", "872",
1285", "1869"876", "880-881", "883-884", "887-889", "894", "914", "916", "918", "921-926", "929", "931", "933-935", "937", "940",
1872", "2170"942-943", "946-947", "954", "959", "961-965", "967-968", "970", "972", "974-977", "980-984", "988-992", "995-997",
2171"]
"999-1000", "1006", "1009-1011", "1013-1015", "1027", "1029-1031", "1033", "1035-1036", "1039-1043", "1045", "10481054", "1057-1059", "1061-1062", "1064-1066", "1068", "1074-1075", "1080", "1084", "1087-1088", "1091-1093", "1113-
Disorder Both Both # Both %
%
Position
2.19% ["723",
6
0.27%
"789",
"1033",
"10371038",
"1854"]
4.59%
[]
0
0.00%
3.73%
["364365",
"599601",
"10741075",
"12021205",
"12631264",
"1267",
"12741275",
"12771279",
24
1.10%
BPV3
BRSV
CDV
DMV
1115", "1118", "1125", "1133", "1135", "1138-1140", "1142", "1145-1146", "1149", "1152", "1164", "1189-1191", "1193",
"1201-1205", "1207", "1209", "1216", "1218", "1220-1221", "1225-1227", "1229", "1231-1233", "1236-1237", "12391240", "1246-1248", "1255", "1258-1259", "1261-1264", "1267-1268", "1272-1279", "1281-1283", "1285", "1290", "1294",
"1296-1297", "1301", "1307", "1310-1311", "1314", "1316", "1318-1319", "1326", "1330", "1339", "1341-1346", "1361",
"1376", "1378-1379", "1381", "1384", "1403", "1405", "1410", "1424-1426", "1431", "1437-1438", "1440", "1444-1456",
"1459", "1461", "1468-1469", "1473", "1484-1485", "1488", "1494", "1503-1504", "1509-1515", "1520-1521", "1527",
"1529", "1534", "1538", "1548-1549", "1557", "1561", "1607", "1609", "1615-1616", "1641", "1643", "1646", "1649-1654",
"1658", "1738-1739", "1741", "1743", "1745", "1748", "1763", "1765-1766", "1772", "1779-1780", "1791", "1793-1794",
"1810-1812", "1832-1835", "1837", "1840-1843", "1848", "1851", "1855", "1859", "1863-1864", "1869", "1881", "1886",
"1888", "1892", "1905", "1930", "1932-1935", "1937-1938", "1955", "1958", "2010", "2014", "2017-2018", "2022", "2024",
"2065", "2070", "2075-2076", "2079", "2111-2112", "2119", "2131", "2154", "2160-2164"]
["11", "13-14", "17", "20", "23", "25-26", "28", "30", "32-35", "37", "39", "45", "47-48", "51", "55", "62", "64", "90", "94", 819 36.68% ["0-13", "65"123-124", "156", "163", "166", "168", "172", "174", "176-182", "186-187", "212-214", "217", "219", "228-230", "233",
67", "144"239", "241", "247-249", "251-252", "255", "257-258", "268-269", "272", "275-276", "281", "283-285", "287", "289-290",
147", "629"292-293", "295", "298", "302-303", "307", "310", "312", "314-315", "323", "345", "349-352", "354", "356-360", "362635", "637363", "369-370", "376", "380", "382-383", "387", "389", "391-398", "400", "402-403", "407", "409-412", "414", "417",
640", "865"420-421", "423-425", "431", "434-440", "443", "446-452", "457-466", "468-471", "476-480", "482", "484-488", "492",
866", "1024"496", "500", "502-503", "506-508", "512", "516", "520", "524-525", "527", "529", "535-537", "546", "548-549", "552",
1026", "1286"556", "558-560", "562-563", "566-567", "569-571", "573-576", "580-581", "584-592", "594-598", "600-601", "603-605",
1287", "1318",
"653-654", "656", "659", "664", "666", "669", "672-677", "679-686", "688-691", "693", "695", "697", "700", "704-705",
"1689-1697",
"707", "710-711", "713", "721-722", "726", "728-735", "741-742", "745", "748-750", "752-755", "757", "759-763", "765"1714-1722",
768", "775", "780-781", "784", "791-792", "795", "799", "802", "807", "809-810", "819-820", "824", "826-828", "834-838",
"1749",
"841-843", "845", "848-868", "870-873", "875-878", "880", "883-884", "887-888", "890-891", "894-897", "899", "901-903",
"1752-1760",
"906-908", "910-912", "915", "918", "922", "927", "929-930", "932", "935-936", "939-941", "943-944", "947-949", "956"1973",
960", "962", "966-967", "971", "979-983", "988", "991-995", "999-1000", "1002", "1004-1010", "1014-1018", "1020-1022",
"2216-2232"]
"1024", "1028", "1032-1034", "1037", "1041", "1049-1051", "1054", "1061-1062", "1065-1068", "1070-1072", "1075",
"1077-1078", "1082-1086", "1088", "1091-1093", "1095", "1098", "1101-1103", "1106", "1109-1111", "1114", "1123",
"1125", "1132-1135", "1137-1139", "1141-1143", "1148", "1153", "1156-1158", "1161-1162", "1164-1165", "1170", "11721173", "1181", "1188-1189", "1192", "1196-1197", "1199-1200", "1202", "1206-1209", "1211", "1214", "1217", "1219",
"1222-1227", "1232", "1234-1236", "1240", "1242-1247", "1250", "1256", "1259", "1264", "1266", "1268-1270", "1274",
"1276", "1278-1279", "1281-1282", "1284-1285", "1287-1288", "1292", "1294-1295", "1298-1299", "1301-1303", "1305",
"1307-1308", "1310", "1313-1314", "1318-1321", "1323", "1325", "1327", "1329-1330", "1339", "1344-1345", "13491350", "1355-1356", "1359", "1361-1362", "1366", "1372-1375", "1377", "1381", "1383", "1398-1399", "1405", "1409",
"1416", "1424", "1429", "1432-1433", "1437", "1440-1442", "1444", "1446-1448", "1451", "1453-1456", "1460", "14631466", "1468-1472", "1474", "1476-1477", "1479-1481", "1488-1490", "1493", "1495-1496", "1499-1505", "1511", "15131514", "1519", "1521-1524", "1527", "1529", "1532-1535", "1538-1541", "1543", "1545", "1547", "1549", "1551", "15531555", "1557-1558", "1560-1561", "1566", "1568-1570", "1573", "1577-1578", "1581", "1584", "1589", "1591", "15931594", "1596", "1599", "1602", "1605-1608", "1610-1612", "1614", "1627", "1634-1636", "1657", "1661", "1663", "1669",
"1671", "1673", "1675-1678", "1681", "1689", "1763", "1770-1771", "1775-1776", "1779", "1781", "1783-1784", "1786",
"1788", "1793", "1803", "1806-1807", "1813-1814", "1817-1818", "1827", "1831-1834", "1837", "1839", "1842", "18481850", "1852-1853", "1867", "1869", "1871-1875", "1877", "1879-1882", "1884", "1886", "1888", "1891", "1893", "1897",
"1899", "1901-1903", "1911", "1916", "1919", "1924-1925", "1930", "1935", "1943", "1963", "1968", "1970-1973", "19751976", "1979-1980", "1993", "1996", "2011", "2015", "2019", "2043-2044", "2047", "2052", "2056", "2066", "2071",
"2086", "2088-2090", "2100", "2105", "2109-2111", "2115", "2118-2119", "2151-2152", "2154", "2156", "2159", "21702171", "2190", "2192", "2196", "2198-2202", "2204-2205"]
[]
0
0.00%
["1-2", "4", "6-8", "158-159", "161-180", "1248-1272", "1713", "1729-1735", "2160-2161"]
["7", "9-10", "13", "16", "19", "21", "24", "26", "28-31", "33", "35", "41", "43-44", "47", "51", "58", "63", "86", "90", "119- 812 37.18% ["1-3", "8",
120", "148", "155", "158", "160", "164", "166", "168-174", "179", "204-206", "209", "211", "220-222", "225", "231", "233",
"490-492",
"239-241", "243-244", "247", "249-250", "260-261", "264", "267-268", "273", "275-277", "279", "281-282", "284-285",
"593-646",
"287", "290", "294-295", "299", "302", "304", "306-307", "315", "337", "341-344", "346", "348-352", "354-355", "361"794", "1032362", "368", "372", "374-375", "379", "381", "383-390", "392", "394-395", "399", "401-404", "406", "409", "412-413",
1034", "1230"415-417", "423", "426-432", "435-436", "438-444", "449-458", "460-463", "468-472", "474", "476-480", "484", "488",
1233", "1281"492", "494-495", "498-500", "504", "508", "512", "516-517", "519", "521", "527-529", "538", "540-541", "544", "548",
1284", "1296",
"550-552", "554-555", "558-559", "561-563", "565-568", "572-573", "576-584", "586-590", "592-597", "653-654", "656",
"1696-1720",
"659", "664", "666", "669", "672-677", "679-686", "688-691", "693", "695", "697", "700", "704-705", "707", "711", "713",
"1736",
"721-722", "726", "728-735", "741-742", "745", "748-750", "752-755", "757", "759-763", "765-768", "775", "780-781",
"1815-1822",
"784", "791-792", "795", "799", "802", "807", "809-810", "819-820", "824", "826-828", "834-838", "841-843", "845",
"2052-2053",
"848", "850-868", "870-873", "875-878", "880", "883-884", "887-888", "890-891", "894-897", "899", "901-903", "906-908",
"2183"]
"910-912", "915", "918", "922", "927", "929-930", "932", "935-936", "939-941", "943-945", "947-949", "956-960", "962",
"966-967", "971", "979-983", "988", "991-995", "999-1000", "1002", "1004-1010", "1014-1018", "1020-1022", "1024",
"1028", "1032-1034", "1037", "1041", "1049-1051", "1054", "1061-1062", "1065-1068", "1070-1072", "1075", "10771078", "1082-1086", "1088", "1091-1093", "1095", "1098", "1101-1103", "1106", "1109-1110", "1114", "1123", "1125",
"1132-1135", "1137-1139", "1142-1143", "1148", "1153", "1156-1158", "1161-1162", "1164-1165", "1170", "1172-1173",
"1182", "1186-1187", "1190", "1194-1195", "1197", "1200", "1204-1207", "1209", "1212", "1215", "1217", "1220-1225",
"1230", "1232-1234", "1238", "1240-1245", "1248", "1254", "1257", "1262", "1264", "1266-1268", "1271-1272", "1274",
"1276-1277", "1279-1280", "1282-1283", "1286", "1290", "1292-1293", "1296-1297", "1299-1301", "1303", "1305-1306",
"1308", "1311-1312", "1316-1319", "1323", "1325", "1327-1328", "1337", "1342-1343", "1347-1348", "1353-1354",
"1357", "1359-1360", "1364", "1370-1373", "1375", "1379", "1381", "1396-1397", "1403", "1407", "1414", "1422", "1427",
"1430-1431", "1435", "1438-1440", "1442", "1444-1446", "1449", "1451-1454", "1456", "1458", "1461-1464", "14661470", "1472", "1474-1475", "1477-1479", "1487", "1491", "1493-1494", "1497-1503", "1509", "1511-1512", "1517",
"1519-1522", "1525", "1527", "1530-1533", "1536-1539", "1541", "1543", "1545", "1547", "1549", "1551-1553", "15551556", "1558-1559", "1564", "1566-1568", "1571", "1575-1576", "1579", "1582", "1589", "1591", "1593-1594", "1596",
"1599", "1602", "1605-1608", "1610-1612", "1614", "1627", "1634-1636", "1657", "1661", "1663", "1669", "1671", "1673",
"1675-1678", "1681", "1688", "1739", "1750-1751", "1755-1756", "1759", "1761", "1763-1764", "1766", "1768", "1773",
"1781", "1783", "1786-1787", "1793-1794", "1797-1798", "1807", "1811-1814", "1817", "1819", "1822", "1828-1830",
"1832-1833", "1843", "1845", "1847-1851", "1853-1858", "1862", "1867", "1869", "1873", "1875", "1877-1879", "1887",
"1892", "1895", "1898", "1900-1901", "1906", "1911", "1919", "1939", "1944", "1946-1949", "1951-1952", "1955-1956",
"1969", "1972", "1987", "1991", "1995", "2023-2024", "2027", "2032", "2036", "2046", "2051", "2060", "2065", "2079",
"2083-2085", "2089", "2092-2093", "2123-2124", "2126", "2128", "2131", "2142-2143", "2164", "2166", "2168", "2170",
"2172-2177", "2179"]
["7", "9-10", "13", "16", "19", "21", "24", "26", "28-31", "33", "35", "41", "43-44", "47", "51", "58", "63", "86", "90", "119- 798 36.56% ["1-3", "5-6",
120", "148", "155", "158", "160", "164", "166", "168-174", "179", "204-206", "209", "211", "220-222", "225", "231", "233",
"487-496",
"239-241", "243-244", "247", "249-250", "260-261", "264", "267-268", "273", "275-276", "279", "281-282", "284-285",
"595-623",
"287", "290", "294", "299", "302", "304", "306-307", "315", "337", "341-344", "346", "348-352", "354-355", "361-362",
"625-632",
"372", "374-375", "379", "381", "383-390", "392", "394-395", "399", "401-404", "406", "409", "412-413", "415-417",
"1228-1233",
"423", "426-432", "435-436", "438-444", "449-458", "460-463", "468-472", "474", "476-480", "484", "488", "492", "494"1281-1284",
495", "498", "500", "504", "508", "512", "516-517", "519", "521", "527-529", "538", "540-541", "544", "548", "550-552",
"1291",
"554-555", "558-559", "561-563", "565-568", "572-573", "576-584", "586-590", "592-593", "595-597", "653-656", "659",
"1379-1386",
"664", "666", "669", "672-677", "679-686", "688-691", "693", "695", "697", "700", "704-705", "707", "711", "713", "721"1815-1827",
722", "726", "728-735", "741-742", "745", "748-750", "752-755", "757", "759-763", "765-768", "775", "780-781", "784",
"2050-2051",
"791-792", "795", "799", "802", "807", "809-810", "819-820", "824", "826-828", "834-838", "841", "843", "845", "848",
"2179",
"850-868", "870-873", "875-878", "880", "883-884", "887-888", "890-891", "894-897", "899", "901-903", "906-908", "910"2182"]
912", "915", "918", "922", "927", "929-930", "932", "935-936", "940-941", "943-944", "947-949", "956-960", "962", "966-967", "971", "979-983", "988", "991-995", "999-1000", "1002", "1005-1010", "1012", "1014-1018", "1020-1022", "1024",
"1028", "1032-1034", "1037", "1041", "1049-1051", "1054", "1061-1062", "1065-1068", "1070-1072", "1075", "1077-1078", "1082-1086", "1088", "1091-1093", "1095", "1098", "1101-1103", "1106", "1109-1110", "1114", "1123", "1125",
"1132-1135", "1137-1139", "1141-1143", "1148", "1153", "1156-1158", "1161", "1164-1165", "1170", "1172-1173", "1186-1187", "1190", "1194-1195", "1197-1198", "1200", "1204-1207", "1209", "1212", "1215", "1217", "1220-1224", "1230",
"1232-1234", "1238", "1240-1245", "1248", "1254", "1257", "1262", "1264", "1266-1268", "1272", "1274", "1276-1277",
"1279-1280", "1282-1283", "1285-1286", "1290", "1292-1293", "1296-1297", "1299-1301", "1303", "1305-1306", "1308",
"1311-1312", "1316-1319", "1323", "1325", "1327-1328", "1337", "1342-1343", "1347-1348", "1353-1354", "1357", "1359-
"12811283",
"1285",
"1869"]
86
3.85%
["11", "13", "865-866", "1024", "1287", "1318", "1689", "1973"]
9
0.40%
63
2.91%
[]
0
0.00%
111
5.08%
["492", "593-597", "1032-1034", "1230", "1232-1233", "1282-1283", "1296", "1817", "1819", "1822"]
18
0.82%
88
4.03%
["488", "492", "494-495", "595-597", "1230", "1232-1233", "1282-1283", "1379", "1381", "1817", "1819", "1822", "2051", "2179"]
19
0.87%
FDLV
GPV
HMPNV
HPIV2
1360", "1364", "1370-1373", "1375", "1379", "1381", "1396-1397", "1403", "1407", "1414", "1422", "1427", "1430-1431",
"1435", "1438-1440", "1442", "1444-1446", "1449", "1451-1454", "1456", "1458", "1461-1464", "1466-1470", "1472",
"1474-1475", "1477-1479", "1487", "1491", "1493-1494", "1497-1503", "1509", "1511-1512", "1517", "1519-1522",
"1525", "1527", "1530-1533", "1536-1539", "1541", "1543", "1545", "1547", "1549", "1551-1553", "1555-1556", "15581559", "1564", "1566-1568", "1571", "1575-1576", "1582", "1589", "1591", "1593-1594", "1596", "1599", "1602", "16061608", "1610-1612", "1614", "1627", "1634-1636", "1657", "1661", "1663", "1667", "1669", "1671", "1673", "1675-1678",
"1681", "1688", "1739", "1750-1751", "1755-1756", "1759", "1761", "1763-1764", "1766", "1768", "1773", "1783", "17861787", "1793-1794", "1797-1798", "1807", "1811-1813", "1817", "1819", "1822", "1828-1830", "1832-1833", "1843",
"1845", "1847-1851", "1853", "1855-1858", "1862", "1864", "1867", "1869", "1873", "1875", "1877-1878", "1887", "1892",
"1895", "1900-1901", "1906", "1911", "1919", "1939", "1944", "1946-1948", "1951-1952", "1955-1956", "1969", "1972",
"1991", "1995", "2023-2024", "2027", "2032", "2036", "2046", "2051", "2060", "2065", "2079", "2083-2085", "2089",
"2092-2093", "2123-2124", "2126", "2128", "2131", "2142-2143", "2149", "2164", "2166", "2168", "2170", "2172-2177",
"2179"]
["6-9", "15", "17", "23", "25", "27-28", "40", "60", "83-84", "87", "117", "156", "159", "161", "165", "169", "172", "175659 30.22% ["1-5", "51176", "179-180", "208-210", "221", "226", "230", "232", "234", "236-237", "240", "242", "244", "248", "250-251", "257",
60", "150",
"265", "269", "272", "274", "276-277", "282-283", "285-286", "288", "291-293", "295", "297", "300", "302", "307-308",
"180-184",
"311-312", "316", "338", "343-345", "349-353", "355-356", "361-363", "367", "369", "373-374", "380", "385-386", "391"189-192",
393", "396", "398", "401-402", "405-407", "410", "413-414", "416", "423-424", "429-430", "432-433", "436-437", "440"486-493",
444", "450-452", "456-458", "461-465", "469-470", "472", "475-477", "480", "493", "496", "498-499", "501-502", "505",
"593-611",
"510", "513-514", "516-518", "520", "522", "525", "528-530", "532-534", "537", "539-540", "542", "545", "548-553",
"616-646",
"555", "558", "560", "562", "564", "566-569", "575-579", "581-582", "585-587", "589-591", "593-598", "648-649", "653",
"712-713",
"655", "661-665", "668-673", "676-681", "684", "686", "692", "694-695", "700-701", "703-704", "706", "717", "723-727",
"990", "1016"729-730", "736-737", "744-745", "747", "749-752", "756", "758-765", "770", "772", "774-776", "786-787", "797-798",
1022", "1026"802", "804-805", "814-815", "818-819", "821", "823-826", "829", "831-833", "835-837", "839-841", "845-864", "866-868",
1030", "1089"870-875", "879", "882-883", "885-886", "889", "892-893", "896-897", "901-902", "905", "907", "924-925", "927", "9291090", "1149931", "934-939", "942", "944", "946-948", "950", "953-957", "960", "967", "972", "974", "976", "978", "980-981", "983",
1156", "1196"986-990", "993-997", "1001-1005", "1008-1009", "1012-1013", "1019", "1022", "1024", "1026-1029", "1040", "10421197", "12031046", "1049-1050", "1052-1058", "1061-1067", "1070-1072", "1074-1075", "1077-1081", "1083", "1087-1088", "1093",
1208", "1210"1096-1098", "1100-1101", "1104-1106", "1109", "1125-1126", "1131", "1137-1138", "1146", "1148", "1151-1153",
1221", "1276"1155", "1157-1159", "1162", "1165", "1177", "1192", "1202-1204", "1206-1207", "1212", "1214-1218", "1220", "1222",
1287", "1291",
"1229", "1232-1234", "1236", "1238-1240", "1242", "1244-1246", "1250", "1252-1253", "1259-1261", "1268", "1271"1346-1347",
1272", "1274-1277", "1280-1281", "1285-1292", "1294", "1298", "1300", "1303", "1307", "1309-1310", "1312", "1314",
"1349-1351",
"1323-1324", "1327", "1329", "1331-1333", "1337", "1339", "1343", "1352", "1354-1359", "1366", "1368", "1389", "1391"1376-1381",
1394", "1397", "1402", "1416", "1437-1439", "1444", "1453", "1457", "1459-1464", "1466", "1468-1469", "1474", "1483",
"1385-1386",
"1495", "1497-1498", "1507", "1514", "1516-1517", "1522-1528", "1533-1534", "1540", "1544", "1547", "1550-1551",
"1453",
"1553", "1559", "1562", "1570", "1574", "1620", "1628", "1632-1633", "1656", "1659-1660", "1663-1667", "1671", "1754"1623",
1756", "1758", "1762", "1765", "1772", "1780", "1782-1783", "1792", "1796-1797", "1806", "1808", "1810-1811", "1827"1726-1728",
1829", "1848-1852", "1854", "1857-1860", "1864-1865", "1868", "1872", "1876", "1878", "1880-1881", "1886", "1898",
"1730",
"1903", "1905", "1909", "1917", "1922", "1947", "1949-1952", "1954-1955", "1960", "1975", "2021", "2028-2029", "2033",
"1734-1742",
"2076", "2080-2081", "2086-2087", "2122-2123", "2130", "2142", "2163", "2169-2174"]
"1813-1831",
"2072-2085",
"2145-2146",
"2175",
"2179-2180"]
["11", "13-14", "17", "20", "23", "25", "28", "30", "32-35", "37", "39", "45", "47-48", "51", "55", "62", "67", "90", "94",
797 36.16% ["0-15", "101"123-124", "156", "163", "166", "168", "172", "174", "176-182", "186-187", "210-212", "215", "217", "226-228", "231",
107", "140"237", "239", "245-247", "249-250", "253", "255-256", "266-267", "270", "273-274", "279", "281-283", "285", "287-288",
154", "156",
"290-291", "293", "296", "300-301", "305", "308", "310", "312-313", "321", "341", "345-348", "350", "352-356", "358"612-634",
359", "365-366", "372", "376", "378-379", "383", "385", "387-394", "396", "398-399", "403", "405-408", "410", "413",
"687-690",
"416-417", "419-421", "427", "430-436", "439", "442-448", "453-462", "464-467", "472-476", "478", "480-484", "488",
"694-696",
"496", "500", "502-503", "506", "508", "512", "516", "520", "524-525", "527", "529", "535-537", "546", "548-549", "552",
"887-894",
"556", "558-560", "562-563", "566-567", "569-571", "573-576", "580-581", "584-592", "594-598", "600-601", "603-605",
"1010",
"631-632", "634", "637", "642", "644", "647", "650-655", "657-664", "666-669", "671", "673", "675", "678", "682-683",
"1013",
"685", "689", "691", "699-700", "704", "706-713", "719-720", "723", "726-728", "730-733", "735", "737-741", "743-746",
"1049-1051",
"753", "758-759", "762", "769-770", "773", "777", "780", "785", "787", "797-798", "802", "804-806", "812-816", "819"1177-1195",
821", "823", "826", "828-846", "848-851", "853-856", "858", "861-862", "865-866", "868-869", "872-875", "877", "879"1808-1810",
881", "884-886", "888-890", "893", "896", "900", "905", "907-908", "910", "913-914", "918-919", "921-922", "925-927",
"1813-1814",
"934-938", "940", "944-945", "949", "957-961", "966", "969-973", "977-978", "980", "982-988", "992-996", "998-1000",
"1854-1858",
"1002", "1006", "1010-1012", "1015", "1019", "1027-1029", "1032", "1039-1040", "1043-1046", "1048-1050", "1053",
"2096-2099",
"1055-1056", "1060-1064", "1066", "1069-1071", "1073", "1076", "1080-1081", "1084", "1087-1088", "1092", "1103",
"2198-2203"]
"1105", "1112-1115", "1117-1119", "1122-1123", "1128", "1133", "1136-1138", "1141-1142", "1144-1145", "1150", "11521153", "1162", "1166-1167", "1170", "1174-1175", "1177-1178", "1180", "1184-1187", "1189", "1192", "1195", "1197",
"1200-1205", "1210", "1212-1214", "1218", "1220-1225", "1228", "1234", "1237", "1242", "1244", "1246-1248", "1252",
"1254", "1256-1257", "1259-1260", "1262-1263", "1266", "1270", "1272-1273", "1276-1277", "1279-1281", "1283", "12851286", "1288", "1291-1292", "1296-1299", "1303", "1305", "1307-1308", "1317", "1322-1323", "1327-1328", "1333-1334",
"1337", "1339-1340", "1344", "1351-1353", "1355", "1359", "1361", "1376-1377", "1383", "1387", "1394", "1404", "1407",
"1410-1411", "1415", "1418-1420", "1422", "1424-1426", "1429", "1431-1434", "1436", "1438", "1441-1442", "1444",
"1446-1450", "1452", "1454-1455", "1457-1459", "1467", "1471-1474", "1477-1483", "1489", "1491-1492", "1497", "14991502", "1505", "1507", "1510-1513", "1516-1519", "1521", "1523", "1525", "1527", "1529", "1531-1533", "1535-1536",
"1538-1539", "1544", "1546-1548", "1551", "1555-1556", "1559", "1562", "1567", "1569", "1571-1572", "1574", "1577",
"1580", "1583-1586", "1588-1590", "1592", "1605", "1612-1614", "1637", "1641", "1649", "1651", "1656-1659", "1661",
"1668", "1731", "1740-1741", "1745-1746", "1749", "1751", "1753-1754", "1756", "1758", "1763", "1773", "1776-1777",
"1783-1784", "1787-1788", "1797", "1801-1803", "1807", "1809", "1812", "1818-1820", "1824-1825", "1845", "18471851", "1853-1858", "1862", "1864", "1867", "1869", "1873", "1875", "1877-1879", "1887", "1892", "1895", "1900-1901",
"1906", "1911", "1919", "1939", "1944", "1946-1949", "1951-1952", "1955-1956", "1967", "1970", "1991", "1995", "20252026", "2029", "2034", "2038", "2048", "2051", "2065", "2069", "2085", "2089-2091", "2095", "2098-2099", "2131-2132",
"2134", "2136", "2154-2155", "2180", "2182", "2186", "2188-2192", "2194-2195"]
[]
0
0.00%
["1-6", "48-49", "65-67", "309", "311", "619", "684-692", "743-751", "754-762", "982-998", "1014-1015", "1172-1196", "2004"]
["7-10", "16", "18", "24", "26", "28-29", "41", "61-62", "90-91", "94", "124", "169", "172", "174", "178", "182", "185",
658 29.09% ["1-3", "6",
"188-189", "192-193", "219-221", "232", "237", "241", "243", "245", "247-248", "251", "253", "255", "259", "261-262",
"140-148",
"268", "276", "280", "283", "285", "287-288", "293-294", "296-297", "299", "302-304", "306", "311", "313", "316", "318"152-157",
319", "322-323", "327", "349", "354-356", "360-364", "366-367", "372-374", "378-379", "383-384", "390", "395-396",
"502-509",
"401-403", "406", "408", "411-412", "415-417", "420", "423-424", "426", "433-434", "439-440", "442-443", "446-447",
"621-655",
"450", "452-454", "460-462", "466-468", "471-475", "479-480", "482", "485-487", "490", "509", "512", "514-515", "517"788-790",
518", "521", "526", "529-530", "532-534", "536", "538", "541", "544-546", "548-550", "553", "555-556", "558", "561",
"1163-1164",
"564-569", "571", "574", "576", "578", "580", "582-585", "591-595", "597-598", "601-603", "605-607", "609-614", "658"1207-1220",
659", "663", "665", "671-675", "678-683", "686-691", "694", "696", "702", "704-705", "710-711", "713-714", "716", "727",
"1284-1290",
"733-739", "745-746", "753-754", "756", "758-761", "765", "767-774", "779", "781", "783-785", "795-796", "806-807",
"1754-1758",
"811", "813-814", "823-824", "827-828", "830", "832-835", "838", "840-842", "844-846", "848-850", "854-873", "875-877",
"1764-1769",
"879-884", "888", "891-892", "894-895", "898", "901-902", "905-906", "910-911", "914-916", "934", "936", "938-940",
"1842-1845",
"943-948", "951", "953", "955-957", "959", "962-967", "969", "976", "981", "983", "985-987", "989-990", "992", "994"1848-1849",
999", "1002-1006", "1010-1014", "1017-1018", "1021-1022", "1028", "1031", "1033", "1035-1038", "1049", "1051-1055",
"2247-2248",
"1058-1059", "1061-1067", "1070-1076", "1079-1081", "1083-1084", "1086-1090", "1096-1097", "1102", "1106-1107",
"2254-2261"]
"1109-1110", "1113-1115", "1118", "1138-1139", "1144", "1150-1151", "1159", "1161", "1164-1166", "1168", "1170-
207
9.49%
["60", "180", "493", "593-598", "990", "1019", "1022", "1026-1029", "1151-1153", "1155", "1203-1204", "1206-1207", "1212", "1214-1218", "1220", "1276-1277", "1280-1281", "1285-1287", "1291", "1453", "1827-1829", "2076", "2080-2081"]
46
2.11%
121
5.49%
["11", "13-14", "156", "631-632", "634", "689", "888-890", "893", "1010", "1049-1050", "1177-1178", "1180", "1184-1187", "1189", "1192", "1195", "1809", "1854-1858", "2098-2099"]
33
1.50%
86
4.29%
[]
0
0.00%
115
5.08%
["509", "1164", "1215-1217", "1219-1220", "1284-1285", "1287-1290"]
13
0.57%
HPV1
HPV3
HRSVA2
HRSVB1
HRSVS2
HV
1172", "1175", "1178", "1190", "1205", "1215-1217", "1219-1220", "1225", "1227-1231", "1233", "1235", "1242", "1245-1247", "1249", "1251-1253", "1255", "1257-1259", "1263", "1265-1266", "1272-1274", "1281", "1284-1285", "1287-1290",
"1293-1294", "1298-1305", "1307", "1311", "1316", "1320", "1322-1323", "1325", "1327", "1336-1337", "1340", "1342",
"1344-1346", "1350", "1352", "1356", "1365", "1367-1372", "1379", "1381", "1402", "1404-1407", "1410", "1415", "1431",
"1450-1452", "1457", "1466", "1470", "1472-1477", "1479", "1481-1482", "1487", "1496", "1508", "1510-1511", "1520",
"1527", "1529-1530", "1535-1541", "1546-1548", "1553", "1557", "1560", "1563-1564", "1566", "1572", "1575", "1583",
"1587", "1633", "1641", "1675", "1678-1679", "1682-1685", "1687", "1691", "1780-1782", "1784", "1788", "1791", "1798",
"1806", "1808-1809", "1818", "1822-1823", "1832", "1834", "1836-1837", "1853-1855", "1880-1884", "1886", "18891892", "1896-1897", "1900", "1904", "1908", "1910", "1912-1913", "1918", "1930", "1935", "1937", "1940-1941", "1949",
"1954", "1979", "1981-1984", "1986-1987", "1992", "2007", "2061", "2068-2070", "2118", "2122-2123", "2128-2129",
"2164", "2188", "2213", "2220-2225", "2227"]
["13-16", "22", "24", "30", "32", "34-35", "47", "64", "90-91", "94", "124", "163", "166", "168", "172", "176", "179", "182- 655 29.46% ["0-13", "253183", "186-187", "215-217", "228", "233", "237", "239", "241", "243-244", "247", "249", "251", "255", "257-258", "264",
261", "601"272", "276", "279", "281", "283-284", "289-290", "292-293", "295", "298-300", "302", "307", "309", "312", "314-315",
610", "646"318-319", "323", "345", "350-352", "356-360", "362-363", "368-370", "374", "376", "380-381", "387", "392-393", "398650", "1032",
400", "403", "405", "408-409", "412-414", "417", "420-421", "423", "430-431", "436-437", "439-440", "443-444", "447",
"1034-1039",
"449-451", "458-459", "463-465", "468-472", "476-477", "479", "482-484", "487", "500", "503", "505-506", "508-509",
"1286-1287",
"512", "517", "520-521", "523-525", "527", "529", "532", "535-537", "539-541", "544", "546-547", "549", "552", "555"1625-1627",
560", "562", "565", "567", "569", "571", "573-576", "582-586", "588-589", "592-594", "596-598", "600-605", "653-654",
"1719-1722",
"658", "660", "666-670", "673-678", "681-686", "689", "691", "697", "699-700", "705-706", "708-709", "711", "722", "728"1747-1752",
732", "734-735", "741-742", "749-750", "752", "754-757", "761", "763-770", "775", "777", "779-781", "791-792", "802"1969-1973",
803", "807", "809-810", "819-820", "823-824", "826", "828-831", "834", "836-838", "840-842", "844-846", "850-869",
"2033-2035",
"871-873", "875-880", "884", "887-888", "890-891", "894", "897-898", "901-902", "906-907", "910-912", "930", "932",
"2206-2222"]
"934-936", "939-944", "947", "949", "951-953", "955", "958-963", "965", "977", "979", "981-983", "985-986", "988", "991995", "998-1002", "1006-1010", "1013-1014", "1017-1018", "1024", "1027", "1029", "1031-1034", "1045", "1047-1051",
"1054-1055", "1057-1063", "1066-1072", "1075-1077", "1079-1080", "1082-1086", "1092-1093", "1098", "1101-1103",
"1105-1106", "1109-1111", "1114", "1130-1131", "1136", "1142-1143", "1151", "1153", "1156-1158", "1160", "11621164", "1167", "1170", "1199", "1209-1211", "1213-1214", "1219", "1221-1225", "1227", "1229", "1236", "1239-1241",
"1243", "1245-1247", "1249", "1251-1253", "1257", "1259-1260", "1266-1268", "1275", "1278-1279", "1281-1284", "12871288", "1292-1299", "1301", "1305", "1310", "1314", "1316-1317", "1321", "1330-1331", "1334", "1336", "1338-1340",
"1344", "1346", "1350", "1359", "1361-1366", "1373", "1375", "1396", "1398-1401", "1404", "1409", "1423", "1444-1446",
"1451", "1460", "1464", "1466-1471", "1473", "1475-1476", "1481", "1489", "1502", "1504-1505", "1514", "1521", "15231524", "1529-1535", "1540-1542", "1547", "1551", "1554", "1557-1558", "1560", "1566", "1569", "1577", "1581", "1627",
"1635", "1640", "1663", "1666-1667", "1670-1674", "1678", "1771-1773", "1775", "1779", "1782", "1789", "1799-1800",
"1809", "1813-1814", "1823", "1825", "1827-1828", "1844-1846", "1863", "1865-1869", "1871", "1874-1877", "18811882", "1885", "1889", "1893", "1895", "1897-1898", "1903", "1915", "1920", "1922", "1925-1926", "1934", "1939",
"1964", "1966-1969", "1971-1972", "1977", "1992", "2040", "2047-2048", "2052", "2101", "2105-2106", "2111-2112",
"2147", "2155", "2167", "2186", "2193-2198", "2200"]
["38-41", "47", "49", "55", "57", "59-60", "72", "89", "115-116", "119", "149", "188", "191", "193", "197", "201", "204",
651 28.83% ["0-34", "36",
"207-208", "211-212", "240-242", "253", "258", "262", "264", "266", "268-269", "272", "274", "276", "280", "282-283",
"631-632",
"289", "297", "301", "304", "306", "308-309", "314-315", "317-318", "320", "323-325", "327", "332", "334", "337", "339"656-665",
340", "343-344", "348", "370", "375-377", "381-385", "387-388", "393-395", "399", "401", "405-406", "412", "417-418",
"741-744",
"423-425", "428", "430", "433-434", "437-439", "442", "445-446", "448", "455-456", "461-462", "464-465", "468-469",
"890-891",
"472", "474-476", "483-484", "488-490", "493-497", "501-502", "504", "507-509", "512", "525", "528", "530-531", "533"1050-1051",
534", "537", "542", "545-546", "548-549", "552", "554", "557", "560-562", "564-566", "569", "571-572", "574", "577",
"1343-1345",
"580-585", "587", "590", "592", "594", "596", "598-601", "607-611", "613-614", "617-619", "621-623", "625-630", "678"1352-1353",
679", "683", "685", "691-695", "698-703", "706-707", "709-711", "714", "716", "722", "724-725", "730-731", "733-734",
"1770-1785",
"736", "747", "753-757", "759-760", "766-767", "774-775", "777", "779-782", "786", "788-795", "800", "802", "804-806",
"2244",
"816-817", "827-828", "832", "834-835", "844-845", "848-849", "851", "853-856", "861-863", "865-867", "869-871", "875"2246",
894", "896-898", "900-905", "909", "912-913", "915-916", "919", "922-923", "926-927", "931-932", "935-937", "955",
"2248-2257"]
"957", "959-961", "964-969", "972", "974", "976-978", "980", "983-988", "990", "1002", "1004", "1006-1008", "10101011", "1013", "1016-1020", "1023-1027", "1031-1035", "1038-1039", "1042-1043", "1049", "1052", "1054", "1056-1059",
"1070", "1072-1076", "1079", "1082-1088", "1091-1097", "1100-1102", "1104-1105", "1107-1111", "1113", "1117-1118",
"1123", "1127-1128", "1130-1131", "1134-1136", "1139", "1155-1156", "1161", "1167-1168", "1176", "1178", "11811183", "1185", "1187-1189", "1192", "1195", "1224", "1234-1236", "1238-1239", "1244", "1246-1250", "1252", "1254",
"1261", "1264-1266", "1268", "1270-1272", "1274", "1276-1278", "1282", "1284-1285", "1291-1293", "1300", "13031304", "1306-1309", "1312-1313", "1317-1324", "1326", "1330", "1335", "1339", "1341-1342", "1344", "1346", "13551356", "1359", "1361", "1363-1365", "1369", "1371", "1375", "1384", "1386-1391", "1398", "1400", "1421", "1423-1426",
"1429", "1448", "1469-1471", "1476", "1485", "1489", "1491-1498", "1500-1501", "1506", "1514", "1527", "1529-1530",
"1539", "1546", "1548-1549", "1554-1560", "1565-1566", "1572", "1576", "1579", "1582-1583", "1585", "1591", "1594",
"1602", "1606", "1652", "1660", "1688", "1691", "1695-1699", "1703", "1800-1802", "1804", "1808", "1811", "1818",
"1826", "1828-1829", "1831", "1838", "1842-1843", "1852", "1854", "1856-1857", "1873-1875", "1892", "1894-1898",
"1900", "1903-1906", "1910-1911", "1914", "1918", "1922", "1924", "1926-1927", "1932", "1944", "1949", "1951", "19541955", "1963", "1968", "1993", "1995-1998", "2000-2001", "2006", "2021", "2069", "2076-2077", "2081", "2130", "21342135", "2140-2141", "2176", "2184", "2196", "2215", "2222-2227", "2229"]
[]
0
0.00%
["1-4", "7-8", "135-149", "169-182", "1249-1276", "2160", "2162-2164"]
[]
0
0.00%
["1-4", "7-8", "135-149", "168-179", "1155-1159", "1249-1276", "1716-1718", "1720-1726", "1749-1752", "2161-2165"]
[]
0
0.00%
["1-4", "7-8", "135-149", "172-183", "1249-1276", "1716", "1762", "2160", "2162-2164"]
["8", "10-11", "14", "20", "22", "25", "27", "29-32", "34", "36", "42", "44-45", "48", "52", "59", "64", "87", "91", "120-121", 795 35.43% ["1-4", "37"153", "160", "163", "165", "169", "171", "173-179", "183-184", "209-211", "214", "216", "227-229", "232", "238", "240",
45", "602"246-248", "250-251", "254", "256-257", "267-268", "271", "274-275", "278", "280", "282", "284", "286", "288-289", "291620", "648",
292", "294", "297", "301-302", "306", "309", "311", "313-314", "322", "344", "348-351", "353", "355-359", "361-362",
"650-674",
"368-369", "375", "379", "381-382", "386", "388", "390-395", "397", "399", "401-402", "406", "408-411", "413", "416",
"687-710",
"419-420", "422-424", "430", "433-439", "442", "445-451", "456-465", "467-470", "475-479", "481", "483-487", "491",
"776-777",
"495", "499", "501-502", "505", "507", "511", "515", "519", "523-524", "526", "528", "534-536", "545", "547-548", "551",
"1061",
"555", "557", "559", "561-562", "565-566", "568-570", "572-575", "579-580", "583-589", "591", "593-597", "599-600",
"1063-1064",
"602-604", "712-713", "715", "718", "723", "725", "728", "731-736", "738-745", "747-750", "752", "754", "756", "759",
"1081-1084",
"763", "766", "769-770", "772", "780-781", "785", "787-789", "791-794", "800-801", "804", "807-809", "811-814", "818"1180-1181",
822", "824-827", "834", "839-840", "843", "850-851", "854", "861", "866", "868-869", "878-879", "883", "885-887", "893"1266-1287",
897", "900-902", "904", "907-927", "929-930", "932", "934-937", "939", "942-943", "946-947", "949-950", "953-956",
"1340-1346",
"958", "960-962", "965-967", "969-971", "974", "977", "981", "986", "988-989", "991", "994-995", "999-1000", "1002"1349",
1003", "1006-1008", "1015-1019", "1021", "1025-1026", "1030", "1036", "1038-1042", "1047", "1050-1054", "1058-1059",
"1353-1360",
"1061", "1063-1069", "1073-1077", "1079-1081", "1083", "1087", "1091-1093", "1096", "1100", "1108-1109", "1113",
"1467-1469",
"1120-1121", "1124-1127", "1129-1131", "1134", "1136-1137", "1141-1145", "1147", "1150-1152", "1154", "1157", "1160"1802-1803",
1162", "1165", "1168-1170", "1173", "1182", "1184", "1191-1194", "1196-1198", "1201-1202", "1207", "1212", "1215"1883",
1217", "1220-1221", "1223-1224", "1229", "1231-1232", "1245-1246", "1249", "1253-1254", "1256", "1259", "1263-1266",
"1891-1896",
"1268", "1271", "1274", "1276", "1279-1284", "1289", "1291-1293", "1297", "1299-1304", "1307", "1313", "1316", "1321",
"2133-2136",
85
3.82%
["13",
"255",
"257258",
"601605",
"1032",
"1034",
"1287",
"1627",
"1969",
"19711972"]
16
0.72%
89
3.94%
["890891",
"1344"]
3
0.13%
67
3.09%
[]
0
0.00%
85
3.92%
[]
0
0.00%
67
3.09%
[]
0
0.00%
153
6.82%
["42",
"4445",
"602604",
"1061",
"10631064",
"1081",
"1083",
"1266",
"1268",
"1271",
"1274",
"1276",
"12791284",
"13411342",
35
1.56%
JV
MENV
MeV
MOSV
"1323", "1325-1327", "1331", "1333", "1335-1336", "1338-1339", "1341-1342", "1345", "1349", "1351-1352", "13551356", "1358-1360", "1362", "1364-1365", "1367", "1370-1371", "1375-1378", "1380", "1382", "1384", "1386-1387",
"1396", "1401-1402", "1406-1407", "1412-1413", "1416", "1418-1419", "1423", "1429-1432", "1434", "1438", "1440",
"1455-1456", "1462", "1466", "1473", "1481", "1486", "1489-1490", "1494", "1497-1499", "1501", "1503-1505", "1508",
"1510-1513", "1515", "1517", "1520-1523", "1525-1529", "1531", "1533-1534", "1536-1538", "1545-1546", "1550", "15521553", "1556-1562", "1568", "1570-1571", "1576", "1578-1581", "1584", "1586", "1589-1592", "1595-1598", "1600",
"1602", "1604", "1606", "1608", "1610-1612", "1614-1615", "1617-1618", "1623", "1625-1627", "1630", "1634-1635",
"1638", "1641", "1646", "1648", "1650-1651", "1653", "1656", "1659", "1662-1665", "1667-1669", "1671", "1684", "1686",
"1692-1693", "1714", "1718", "1720", "1726", "1728", "1730", "1732-1735", "1738", "1744", "1805-1806", "1810-1811",
"1814", "1816", "1818-1819", "1821", "1823", "1828", "1838", "1841-1842", "1848-1849", "1852-1853", "1862", "18661868", "1872", "1874", "1877", "1883-1885", "1887-1888", "1902", "1904", "1906-1910", "1912", "1914-1917", "1921",
"1926", "1928", "1932", "1934", "1936-1938", "1946", "1951", "1954", "1959-1960", "1965", "1970", "1978", "1998",
"2003", "2005-2008", "2010-2011", "2015", "2028", "2031", "2050", "2054", "2084-2085", "2088", "2093", "2097", "2107",
"2112", "2121", "2126", "2140", "2144-2146", "2150", "2153-2154", "2186-2187", "2189", "2191", "2194", "2205-2206",
"2227", "2229", "2231", "2233", "2235-2240", "2242"]
["15-18", "24", "26", "32", "34", "36-37", "49", "69", "92-93", "96", "126", "165", "168", "170", "174", "178", "181", "184185", "188-189", "217-219", "230", "235", "239", "241", "243", "245-246", "249", "251", "253", "257", "259-260", "266",
"274", "278", "281", "283", "285-286", "291-292", "294-295", "297", "300-302", "304", "309", "311", "314", "316-317",
"320-321", "325", "347", "352-354", "358-362", "364-365", "370-372", "376", "378", "382-383", "389", "394-395", "400402", "405", "407", "410-411", "414-416", "419", "422-423", "425", "432-433", "438-439", "441-442", "445-446", "449453", "460-461", "465-467", "470-474", "478-479", "481", "484-486", "489", "502", "505", "507-508", "510-511", "514",
"519", "522-523", "525-526", "529", "531", "534", "537-539", "541-543", "546", "548-549", "551", "554", "557-562",
"564", "567", "569", "571", "573", "575-578", "584-588", "590-591", "594-596", "598-600", "602-607", "667-668", "672",
"674", "680-684", "687-692", "695-696", "698-700", "703", "705", "711", "713-714", "719-720", "722-723", "725", "736",
"742-746", "748-749", "755-756", "763-764", "766", "768-771", "775", "777-784", "789", "791", "793-795", "805-806",
"816-817", "821", "823-824", "833-834", "837-838", "840", "842-845", "848", "850-852", "854-856", "858-860", "862",
"864-883", "885-887", "889-894", "898", "901-902", "904-905", "908", "911-912", "915-916", "920-921", "924-926", "944",
"946", "948-950", "953-958", "961", "963", "965-967", "969", "972-977", "979", "986", "991", "993", "995-997", "9991000", "1002", "1005-1009", "1012-1016", "1020-1024", "1027-1028", "1031-1032", "1038", "1041", "1043", "1045-1048",
"1059", "1061-1065", "1068-1069", "1071-1077", "1080-1086", "1089-1091", "1093-1094", "1096-1100", "1106-1107",
"1112", "1115-1117", "1119-1120", "1123-1125", "1128", "1144-1145", "1150", "1156-1157", "1165", "1167", "11701172", "1174", "1176-1178", "1181", "1184", "1196", "1211", "1221-1223", "1225-1226", "1231", "1233-1237", "1239",
"1241", "1248", "1251-1253", "1255", "1257-1259", "1261", "1263-1265", "1269", "1271-1272", "1278-1280", "1287",
"1290-1291", "1293-1296", "1299-1300", "1304-1311", "1313", "1317", "1322", "1326", "1328-1329", "1331", "1333",
"1342-1343", "1346", "1348", "1350-1352", "1356", "1358", "1362", "1371", "1373-1378", "1385", "1387", "1408", "14101413", "1416", "1421", "1435", "1456-1458", "1463", "1472", "1476", "1478-1483", "1485", "1487-1488", "1493", "1501",
"1514", "1516-1517", "1526", "1533", "1535-1536", "1541-1547", "1552-1554", "1559", "1563", "1566", "1569-1570",
"1572", "1578", "1581", "1589", "1593", "1639", "1647", "1652", "1675", "1678", "1682-1686", "1690", "1769-1771",
"1773", "1777", "1780", "1787", "1797-1798", "1800", "1807", "1811-1812", "1821", "1823", "1825-1826", "1842-1844",
"1861", "1863-1867", "1869", "1872-1875", "1879-1880", "1883", "1887", "1891", "1893", "1895-1896", "1901", "1913",
"1918", "1920", "1923-1924", "1932", "1937", "1962", "1964-1967", "1969-1970", "1975", "1990", "2042", "2049-2050",
"2054", "2097", "2101-2102", "2107-2108", "2143", "2151", "2163", "2186", "2192-2197"]
["7-10", "16", "18", "24", "26", "28-29", "41", "61", "65", "94-95", "98", "128", "175", "178", "180", "184", "188", "191",
"194-195", "198-199", "225-227", "238", "243", "247", "249", "251", "253-254", "257", "259", "261", "265", "267-268",
"274", "282", "286", "289", "291", "293-294", "299-300", "302-303", "305", "308-310", "312", "317", "319", "322", "324325", "328-329", "333", "355", "360-362", "366-370", "372-373", "378-380", "384", "386", "390-391", "397", "402-403",
"408-410", "413", "415", "418-419", "422-424", "427", "430-431", "433", "441", "446-447", "449-450", "453-454", "457",
"459-461", "468-469", "473-475", "478-482", "486-487", "489", "492-494", "497", "516", "519", "521-522", "524-525",
"528", "533", "536-537", "539-541", "543", "545", "548", "551-553", "555-557", "560", "562-563", "565", "568", "571576", "578", "581", "583", "585", "589-592", "598-602", "604-605", "608-610", "612-614", "616-621", "665-666", "670",
"672", "678-682", "685-690", "693-698", "701", "703", "709", "711-712", "717-718", "720-721", "723", "734", "740-744",
"746-747", "753-754", "761-762", "764", "766-769", "773", "775-782", "787", "789", "791-793", "803-804", "814-815",
"819", "821-822", "831-832", "835-836", "838", "840-843", "848-850", "852-854", "856-858", "862-881", "883-885", "887892", "896", "899-900", "902-903", "906", "909-910", "913-914", "918-919", "922-924", "942", "944", "946-948", "951956", "959", "961", "963-965", "967", "970-975", "977", "989", "991", "993-995", "997-998", "1000", "1002-1007", "10101014", "1018-1022", "1025-1026", "1029-1030", "1036", "1039", "1041", "1043-1046", "1057", "1059-1063", "1066-1067",
"1069-1075", "1078-1084", "1087-1089", "1091-1092", "1094-1098", "1104-1105", "1110", "1114-1115", "1117-1118",
"1121-1123", "1126", "1148-1149", "1154", "1160-1161", "1169", "1171", "1174-1176", "1178", "1180-1182", "1185",
"1188", "1200", "1215", "1225-1227", "1229-1230", "1235", "1237-1241", "1243", "1245", "1252", "1255-1257", "1259",
"1261-1263", "1265", "1267-1269", "1273", "1275-1276", "1282-1284", "1291", "1294-1295", "1297-1300", "1303-1304",
"1308-1315", "1317", "1321", "1326", "1330", "1332-1333", "1337", "1346-1347", "1350", "1352", "1354-1356", "1360",
"1362", "1366", "1375", "1377-1382", "1389", "1391", "1412", "1414-1417", "1420", "1425", "1441", "1460-1462", "1467",
"1476", "1480", "1482-1487", "1489", "1491-1492", "1497", "1506", "1518", "1520-1521", "1530", "1537", "1539-1540",
"1545-1551", "1556-1558", "1563", "1567", "1570", "1573-1574", "1576", "1582", "1585", "1593", "1597", "1643", "1651",
"1685", "1688", "1692-1695", "1697", "1701", "1791-1793", "1795", "1799", "1802", "1809", "1819-1820", "1829", "18331834", "1843", "1845", "1847-1848", "1864-1866", "1889-1893", "1895", "1898-1901", "1905-1906", "1909", "1913",
"1917", "1919", "1921-1922", "1927", "1939", "1944", "1946", "1949-1950", "1958", "1963", "1988", "1990-1993", "19951996", "2001", "2016", "2070", "2077-2078", "2082", "2127", "2131-2132", "2137-2138", "2173", "2197", "2222", "22292234", "2236"]
["7", "9-10", "13", "16", "19", "21", "24", "26", "28-31", "33", "35", "41", "43-44", "47", "51", "58", "63", "86", "90", "119120", "133", "148", "155", "158", "160", "164", "166", "168-174", "178-179", "204-206", "209", "211", "220-222", "225",
"231", "233", "239-241", "243-244", "247", "249-250", "260-261", "264", "267-268", "273", "275-276", "279", "281-282",
"284-285", "287", "290", "294-295", "299", "302", "304", "306-307", "315", "337", "341-344", "346", "348-352", "354355", "361-362", "368", "372", "374-375", "379", "381", "383-390", "392", "394-395", "399", "401-404", "406", "409",
"412-413", "415-417", "422-423", "426-432", "435-436", "438-444", "449-458", "460-463", "468-472", "474", "476-480",
"484", "488", "492", "494-495", "498", "500", "504", "508", "512", "516-517", "519", "521", "527-529", "538", "540-541",
"544", "548", "550-552", "554-555", "558-559", "561-563", "565-568", "572-573", "576-584", "586-590", "592-597", "653654", "656", "659", "664", "666", "669", "672-677", "679-686", "688-691", "693", "695", "697", "700", "704-705", "707",
"711", "713", "721-722", "726", "728-735", "741-742", "745", "748-750", "752-755", "757", "759-763", "765-768", "775",
"780-781", "784", "791-792", "795", "799", "802", "807", "809-810", "819-820", "824", "826-828", "834-838", "841-843",
"845", "848", "850-868", "870-873", "875-878", "880", "883-884", "887-888", "890-891", "894-897", "899", "901-903",
"906-908", "910-912", "915", "918", "922", "927", "929-930", "932", "935-936", "940-941", "943-944", "947-949", "956960", "962", "966-967", "971", "979-983", "988", "991-995", "999-1000", "1002", "1004-1010", "1012", "1014-1018",
"1020-1022", "1024", "1028", "1032-1034", "1037", "1041", "1049-1051", "1054", "1061-1062", "1065-1068", "10701072", "1075", "1077-1078", "1082-1086", "1088", "1091-1093", "1095", "1098", "1101-1103", "1106", "1109-1110",
"1114", "1123", "1125", "1132-1135", "1137-1139", "1141-1143", "1148", "1153", "1156-1158", "1161", "1164-1165",
"1170", "1172-1173", "1186-1187", "1190", "1194-1195", "1197-1198", "1200", "1204-1207", "1209", "1212", "1215",
"1217", "1220-1225", "1230", "1232-1234", "1238", "1240-1245", "1248", "1254", "1257", "1262", "1264", "1266-1268",
"1272", "1274", "1276-1277", "1279-1280", "1282-1283", "1286", "1290", "1292-1293", "1296-1297", "1299-1301",
"1303", "1305-1306", "1308", "1311-1312", "1316-1319", "1323", "1325", "1327-1328", "1337", "1342-1343", "13471348", "1353-1354", "1357", "1359-1360", "1364", "1370-1373", "1375", "1379", "1381", "1396-1397", "1403", "1407",
"1414", "1422", "1427", "1430-1431", "1435", "1438-1440", "1442", "1444-1446", "1449", "1451-1454", "1458", "14611464", "1466-1470", "1472", "1474-1475", "1477-1479", "1487", "1491", "1493-1494", "1497-1503", "1509", "1511-1512",
"1517", "1519-1522", "1525", "1527", "1530-1533", "1536-1539", "1541", "1543", "1545", "1547", "1549", "1551-1553",
"1555-1556", "1558-1559", "1564", "1566-1568", "1571", "1575-1576", "1579", "1582", "1589", "1591", "1593-1594",
"1596", "1599", "1602", "1606-1608", "1610-1612", "1614", "1627", "1634-1636", "1657", "1661", "1663", "1669", "1671",
"1673", "1675-1678", "1681", "1688", "1739", "1750-1751", "1755-1756", "1759", "1761", "1763-1764", "1766", "1768",
"1773", "1783", "1786-1787", "1793-1794", "1797-1798", "1807", "1811-1813", "1817", "1819", "1822", "1828-1830",
"1832-1833", "1843", "1845", "1847-1851", "1853", "1855-1858", "1862", "1864", "1867", "1869", "1873", "1875", "18771879", "1887", "1892", "1895", "1900-1901", "1906", "1911", "1919", "1939", "1944", "1946-1949", "1951-1952", "19551956", "1969", "1972", "1991", "1995", "2023-2024", "2027", "2032", "2036", "2046", "2051", "2060", "2065", "2079",
"2083-2085", "2089", "2092-2093", "2123-2124", "2126", "2128", "2131", "2142-2143", "2164", "2166", "2168", "2170",
"2172-2177", "2179"]
["9-12", "18", "20", "26", "28", "30-31", "43", "63", "86-87", "90", "120", "159", "162", "164", "168", "172", "175", "178179", "182-183", "211-213", "224", "229", "233", "235", "237", "239-240", "243", "245", "247", "251", "253-254", "260",
"268", "272", "275", "277", "279-280", "285-286", "288-289", "291", "294-296", "298", "303", "305", "310-311", "314315", "319", "341", "346-348", "352-356", "358-359", "364-366", "370", "372", "376-377", "383", "388-389", "394-396",
"2141-2146"]
"1345",
"1349",
"13551356",
"13581360",
"1883",
"21442146"]
657 29.81% ["1-5", "149-150", "152", "154-159", "370", "420-421", "428-429", "492-494", "496-504", "608-663", "1230", "1232-1234", "1294-1299", "1302", "1309-1319", "1756-1765", "1897-1904", "2203"]
128
5.81%
["370",
"502",
"12331234",
"12941296",
"1299",
"13091311",
"1313",
"1317",
"1901"]
14
0.64%
650 28.65% ["0-3", "7", "622-661", "723", "729-730", "733-734", "798-801", "875-877", "1170-1171", "1220-1223", "1226-1229", "1233-1236", "1250", "1290-1291", "1293", "1546-1552", "1730-1731", "1733-1736", "1770", "1773-1774", "1780", "1850-1851", "1854-1862", "2135-2138", "2262-2263", "2265-2268"]
113
4.98%
["7",
"723",
"734",
"875877",
"1171",
"12261227",
"1229",
"1235",
"1291",
"15461551",
"21372138"]
20
0.88%
805 36.88% ["1-3", "5", "8", "85-91", "136-137", "595-623", "625-626", "637-649", "1033-1034", "1214-1217", "1278-1284", "1294-1296", "1710-1712", "1812-1823", "1826", "2182"]
91
4.17%
["86",
"90",
"595597",
"10331034",
"1215",
"1217",
"12791280",
"12821283",
"1296",
"18121813",
"1817",
"1819",
"1822"]
19
0.87%
661 29.98% ["0-9", "139-146", "188-195", "414", "579", "598-603", "607-638", "642-643", "646-647", "1037", "1218-1233", "1286", "1288-1295", "1303-1308", "1386-1393", "1464-1469", "1720-1724", "1742-1745", "2131-2133", "2202-2204"]
131
5.94%
["9", "579", "598-601",
25
1.13%
MuV
NCDV
NIPH
"399", "401", "404-405", "408-410", "413", "416-417", "419", "426-427", "432-433", "435-436", "439-440", "443-447",
"453-455", "459-461", "464-468", "472-473", "475", "478-480", "483", "496", "499", "501-502", "504-505", "508", "513",
"516-517", "519-521", "523", "525", "528", "531-533", "535-537", "540", "542-543", "545", "548", "551-556", "558",
"561", "563", "565", "569-572", "578-582", "584-585", "588-590", "592-594", "596-601", "654", "657-658", "662", "664",
"670-674", "677-682", "685-690", "693", "695", "701", "703-704", "709-710", "712-713", "715", "726", "732-736", "738-739", "745-746", "753-754", "756", "758-761", "765", "767-774", "779", "781", "783-785", "795-796", "806-807", "811",
"813-814", "823-824", "827-828", "830", "832-835", "838", "840-842", "844-846", "848-850", "852", "854-873", "875-877",
"879-884", "888", "891-892", "894-895", "898", "901-902", "905-906", "910-911", "914", "916", "933-934", "936", "938-940", "943-948", "951", "953", "955-957", "959", "962-966", "969", "976", "981", "983", "985-987", "989-990", "992",
"995-999", "1002-1006", "1010-1014", "1017-1018", "1021-1022", "1028", "1031", "1033", "1035-1038", "1049", "1051-1055", "1058-1059", "1061-1067", "1070-1076", "1079-1081", "1083-1084", "1086-1090", "1092", "1096-1097", "1102",
"1105-1107", "1109-1110", "1113-1115", "1118", "1134-1135", "1140", "1146-1147", "1155", "1157", "1160-1162",
"1164", "1166-1168", "1171", "1174", "1186", "1201", "1211-1213", "1215-1216", "1221", "1223-1227", "1229", "1231",
"1238", "1241-1243", "1245", "1247-1249", "1251", "1253-1255", "1259", "1261-1262", "1268-1270", "1277", "1280-1281", "1283-1286", "1289-1290", "1294-1301", "1303", "1307", "1309", "1312", "1316", "1318-1319", "1321", "1323",
"1332-1333", "1336", "1338", "1340-1342", "1346", "1348", "1352", "1361", "1363-1368", "1375", "1377", "1398", "14001403", "1406", "1411", "1425", "1446-1448", "1453", "1462", "1466", "1468-1473", "1475", "1477-1478", "1483", "1491",
"1504", "1506-1507", "1516", "1523", "1525-1526", "1531-1537", "1542-1544", "1549", "1553", "1556", "1559-1560",
"1562", "1568", "1571", "1579", "1583", "1631", "1639", "1644", "1667", "1670-1671", "1674-1678", "1682", "1763-1765",
"1767", "1771", "1774", "1781", "1791-1792", "1794", "1801", "1805-1806", "1815", "1817", "1819-1820", "1822", "18361838", "1855", "1857-1861", "1863", "1866-1869", "1873-1874", "1877", "1881", "1885", "1887", "1889-1890", "1895",
"1907", "1912", "1914", "1918", "1926", "1931", "1956", "1958-1961", "1963-1964", "1969", "1984", "2042", "2049-2050",
"2054", "2097", "2101-2102", "2107-2108", "2143", "2151", "2163", "2186", "2192-2197"]
["7-10", "16", "18", "24", "26", "28-29", "41", "61-62", "90-91", "94", "124", "169", "172", "174", "178", "182", "185",
647 28.62%
["1", "6",
"188-189", "192-193", "219-221", "232", "237", "241", "243", "245", "247-248", "251", "253", "255", "259", "261-262",
"148-151",
"268", "276", "280", "283", "285", "287-288", "293-294", "296-297", "299", "302-303", "306", "311", "313", "316", "318"430", "503319", "322-323", "327", "349", "354-356", "360-364", "366-367", "372-374", "378", "380", "384-385", "391", "396-397",
511", "616"402-404", "407", "409", "412-413", "416-418", "421", "424-425", "427", "435", "440-441", "443-444", "447-448", "451",
655", "716"453-455", "462-463", "467-469", "472-476", "480-481", "483", "486-488", "491", "510", "513", "515-516", "518-519",
718", "723",
"522", "527", "530-531", "533-535", "537", "539", "542", "545-547", "549-551", "554", "556-557", "559", "562", "565"1208-1222",
570", "572", "575", "577", "579", "581", "583-586", "592-596", "598-599", "602-604", "606-608", "610-615", "659-660",
"1224-1226",
"664", "666", "672-676", "679-684", "687-692", "695", "697", "703", "705-706", "711-712", "714-715", "717", "728", "734"1768-1770",
738", "740-741", "747-748", "755-756", "758", "760-763", "767", "769-776", "781", "783", "785-787", "797-798", "808"2019",
809", "813", "815-816", "825-826", "829-830", "832", "834-837", "842-844", "846-848", "850-852", "856-875", "877-879",
"2246",
"881-886", "890", "893-894", "896-897", "900", "903-904", "907-908", "912-913", "916-918", "936", "938", "940-942",
"2249-2253",
"945-950", "953", "955", "957-959", "961", "964-969", "971", "983", "985", "987-989", "991-992", "994", "996-1001",
"2255-2260"]
"1004-1008", "1012-1016", "1019-1020", "1023-1024", "1030", "1033", "1035", "1037-1040", "1051", "1053-1057",
"1060", "1063-1069", "1072-1078", "1081-1083", "1085-1086", "1088-1092", "1098-1099", "1104", "1107-1109", "11111112", "1115-1117", "1120", "1140-1141", "1146", "1152-1153", "1161", "1163", "1166-1168", "1170", "1172-1174",
"1177", "1180", "1192", "1207", "1217", "1219", "1221-1222", "1229-1233", "1235", "1237", "1244", "1247-1249", "1251",
"1253-1255", "1257", "1259-1261", "1265", "1267-1268", "1274-1276", "1283", "1286-1287", "1289-1292", "1295-1296",
"1300-1307", "1309", "1313", "1318", "1322", "1324-1325", "1327", "1329", "1338-1339", "1342", "1344", "1346-1348",
"1352", "1354", "1358", "1367", "1369-1374", "1381", "1383", "1404", "1406-1409", "1412", "1433", "1452-1454", "1459",
"1468", "1472", "1474-1479", "1481", "1483-1484", "1489", "1498", "1510", "1512-1513", "1522", "1529", "1531-1532",
"1537-1543", "1548-1549", "1555", "1559", "1562", "1565-1566", "1568", "1574", "1577", "1585", "1589", "1635", "1643",
"1677", "1680", "1684-1687", "1689", "1693", "1781-1783", "1785", "1789", "1792", "1799", "1809-1810", "1819", "18231824", "1833", "1835", "1837-1838", "1854-1856", "1881-1885", "1887", "1890-1893", "1897-1898", "1901", "1905",
"1909", "1911", "1913-1914", "1919", "1931", "1936", "1938", "1941-1942", "1950", "1955", "1980", "1982-1985", "19871988", "1993", "2008", "2062", "2069-2071", "2119", "2123-2124", "2129-2130", "2165", "2189", "2214", "2221-2226",
"2228"]
["11", "14", "17", "20", "23", "25-26", "28", "32-35", "37", "39", "45", "48", "51", "55", "62", "66-68", "71", "75", "79",
799 36.25% ["0-21", "143"90", "94", "111", "116", "123-124", "141", "154", "156", "163", "166", "168", "172", "174", "176-182", "187", "208-213",
155", "604"215", "217", "225-228", "231", "237", "239", "245-249", "255-256", "266-267", "273-274", "279", "281-285", "288", "290633", "686291", "293", "296", "300-301", "305", "308", "310", "312", "320", "340-341", "345-348", "350", "352-354", "356", "358697", "761359", "365-366", "372", "376", "378-379", "383", "385-389", "391-394", "396", "398-399", "403", "405-406", "408", "410",
765", "1008"413", "416-417", "419-421", "423", "427", "430-436", "439-440", "442-444", "446-448", "450", "453-462", "464-467",
1016", "1096"472-474", "476", "478", "480-484", "488", "500", "502-503", "506-508", "512", "516", "519-520", "522", "524-525",
1106", "1175"527", "529", "535-537", "546", "548-549", "556", "558-559", "562-563", "566-567", "569-571", "573-576", "580-581",
1194", "1380"584-592", "594", "596-598", "600-601", "603-605", "630-632", "634", "637", "642", "644", "647", "650-655", "657-664",
1381", "1630"666-669", "671", "673", "675", "678", "682-683", "685", "688-689", "691", "696", "699-700", "704", "706-713", "7191631", "1633720", "723", "726-728", "731-733", "735", "737-741", "743-746", "753", "758-759", "762", "769-770", "773-774", "777",
1634", "1727"780", "785", "787", "797-798", "802", "804-806", "812-816", "819-821", "823", "826-834", "836-846", "848-849", "851",
1733", "1805"854-856", "858", "861-862", "865-866", "868-869", "872-875", "877", "879-881", "884-886", "889-890", "893", "896",
1814", "1850"898", "900", "905", "907-908", "910", "913-914", "917-919", "922", "925-927", "934-938", "940", "944-945", "949", "9571858", "2091961", "966", "969", "971-972", "977-978", "980", "982-988", "992-996", "998-1000", "1002", "1006", "1010-1012", "1015",
2099", "2199"1019", "1027-1029", "1032-1033", "1039-1040", "1043-1046", "1048-1050", "1053", "1055-1056", "1060-1064", "1066",
2203"]
"1069-1071", "1073", "1076", "1079-1081", "1084", "1087-1089", "1092", "1103", "1105", "1112-1115", "1117-1119",
"1121-1123", "1128", "1133", "1136-1138", "1141-1142", "1145", "1150", "1152-1153", "1162", "1166-1167", "1170",
"1174-1175", "1177", "1180", "1184-1187", "1189", "1192", "1197", "1200-1205", "1212-1214", "1218", "1220-1225",
"1228", "1234", "1237", "1242", "1244", "1246", "1248", "1252", "1254", "1256-1257", "1259-1260", "1262-1263", "12651266", "1270", "1272-1273", "1276-1277", "1279-1281", "1284", "1286-1287", "1289", "1293-1294", "1298-1301", "13071308", "1317", "1322-1323", "1327-1328", "1334", "1337", "1339-1340", "1344", "1351-1353", "1355", "1359", "1361",
"1373", "1376-1377", "1383", "1387", "1394", "1404", "1407", "1410-1411", "1415", "1419-1420", "1422", "1424-1426",
"1429", "1431-1434", "1436", "1438", "1441-1444", "1446-1449", "1452", "1454-1455", "1457-1459", "1467-1469", "14711474", "1477-1483", "1489", "1491-1492", "1497", "1499-1502", "1505", "1507", "1510-1513", "1516-1519", "1523",
"1525", "1527", "1529", "1531-1533", "1535-1536", "1539", "1544", "1546-1548", "1551", "1556", "1559", "1562", "1567",
"1569", "1571-1572", "1574", "1577", "1580", "1583-1586", "1588-1589", "1592", "1605", "1612-1614", "1637", "1641",
"1643", "1649", "1651", "1656-1659", "1661", "1668", "1731", "1740-1741", "1745-1746", "1749", "1751", "1753-1754",
"1756", "1758", "1763", "1773", "1776-1777", "1783-1784", "1787-1788", "1792", "1797", "1801-1804", "1807", "1809",
"1812", "1818-1820", "1824-1825", "1845", "1848-1851", "1853", "1855-1858", "1860", "1862", "1867", "1869", "1873",
"1875", "1877-1879", "1887", "1892-1893", "1895", "1900-1901", "1906", "1911", "1919", "1939", "1944", "1946-1949",
"1951-1952", "1955-1956", "1970", "1987", "1991", "1995", "2025-2026", "2029-2030", "2034", "2038", "2048", "2065",
"2069", "2080", "2085", "2089-2091", "2095", "2098-2099", "2131-2132", "2134", "2139", "2154", "2179-2181", "2183",
"2187-2192"]
["8", "10-11", "14", "17", "20", "22-23", "25", "27", "29-32", "34", "36", "42", "44-45", "48", "52", "59", "64", "87", "91", 812 36.19% ["1-4", "37"120-121", "134", "153", "160", "163", "165", "169", "171", "173-179", "184", "209-211", "214", "216", "227-229", "232",
48", "494"238", "240", "246-248", "250-251", "254", "256-257", "267-268", "271", "274-275", "278", "280", "282-283", "286", "288501", "602289", "291-292", "294", "297", "301-302", "306", "309", "311", "313-314", "322", "344", "348-351", "353", "355-359",
616", "636",
"361-362", "368-369", "375", "379", "381-382", "386", "388", "390-397", "399", "401-402", "406", "408-411", "413",
"639", "643"416", "419-420", "422-424", "429-430", "433-439", "442", "445-451", "456-465", "467-470", "475-479", "481", "483-487",
644", "651"491", "495", "499", "501-502", "505", "507", "511", "515", "519", "523-524", "526", "528", "534-536", "545", "547-548",
670", "691",
"551", "555", "557-559", "561-562", "565-566", "568-570", "572-575", "579-580", "583-591", "593-597", "599-604", "712"695-696",
715", "718", "723", "725", "728", "731-736", "738-745", "747-750", "752", "754", "756", "759", "763-764", "766", "769"776-777",
770", "772", "780-781", "785", "787-794", "800-801", "804", "807-809", "811-814", "816", "818-822", "824-827", "834",
"1061",
"839-840", "843", "850-851", "854", "858", "861", "866", "868", "878-879", "883", "885-887", "893-897", "900-902",
"1063-1064",
"904", "907-927", "929-932", "934-937", "939", "942-943", "946-947", "949-950", "953-956", "958", "960-962", "965-967",
"1260-1261",
"969-971", "974", "977", "981", "986", "988-989", "991", "994-995", "999-1000", "1002-1003", "1006-1008", "1015-1019",
"1267-1290",
"1021", "1025-1026", "1030", "1038-1042", "1047", "1050-1054", "1058-1059", "1061", "1063-1069", "1071", "1073"1340-1349",
1077", "1079-1081", "1083", "1087", "1091-1093", "1096", "1100", "1108-1110", "1113", "1120-1121", "1124-1127",
"1459",
"1129-1131", "1134", "1136-1137", "1141-1145", "1147", "1150-1152", "1154", "1157", "1160-1162", "1165", "1168"1466-1470",
1170", "1173", "1182", "1184", "1191-1194", "1196-1198", "1200-1202", "1207", "1212", "1215-1217", "1220", "1223"1472-1477",
1224", "1229", "1231-1232", "1241", "1245-1246", "1249", "1253-1254", "1256", "1259", "1263-1266", "1268", "1271",
"1707-1713",
"1037",
"1221",
"12231227",
"1229",
"1231",
"1286",
"12891290",
"12941295",
"1303",
"1307",
"1466",
"14681469"]
94
4.16%
["510",
"717",
"1217",
"1219",
"12211222"]
6
0.27%
168
7.62%
45
2.04%
148
6.60%
["11",
"14",
"17",
"20",
"154",
"604605",
"630632",
"688689",
"691",
"696",
"762",
"10101012",
"1015",
"1103",
"1105",
"1175",
"1177",
"1180",
"11841187",
"1189",
"1192",
"1731",
"1807",
"1809",
"1812",
"18501851",
"1853",
"18551858",
"2091",
"2095",
"20982099"]
["42",
"4445",
"48",
"495",
"499",
"501",
"602604",
"1061",
"10631064",
"1268",
"1271",
"1274",
"1276",
"12791284",
33
1.47%
"1274", "1276", "1279-1284", "1289", "1291-1293", "1297", "1299-1304", "1307", "1313", "1316", "1321", "1323", "1325"1798-1805",
1327", "1331", "1333", "1335-1336", "1338-1339", "1341-1342", "1345", "1349", "1351-1352", "1355-1356", "1358-1360",
"1892-1893",
"1362", "1364-1365", "1367", "1370-1371", "1375-1378", "1380", "1382", "1384", "1386-1387", "1396", "1401-1402",
"2135-2136",
"1406-1407", "1412-1413", "1416", "1418-1419", "1423", "1429-1432", "1434", "1438", "1440", "1455-1456", "1462",
"2204-2213"]
"1466", "1473", "1481", "1486", "1489-1490", "1494", "1497-1499", "1501", "1503-1505", "1508", "1510-1513", "1515",
"1517", "1520-1523", "1525-1529", "1531", "1533-1534", "1536-1538", "1545-1547", "1550", "1552-1553", "1556-1562",
"1568", "1570-1571", "1576", "1578-1581", "1584", "1586", "1589-1592", "1595-1598", "1600", "1602", "1604", "1606",
"1608", "1610-1612", "1614-1615", "1617-1618", "1623", "1625-1627", "1630", "1634-1635", "1638", "1641", "1646",
"1648", "1650-1651", "1653", "1656", "1659", "1663-1665", "1667-1669", "1671", "1684", "1691-1693", "1714", "1718",
"1720", "1726", "1728", "1730", "1732-1735", "1738", "1744", "1805-1806", "1810-1811", "1814", "1816", "1818-1819",
"1821", "1823", "1828", "1838", "1841-1842", "1848-1849", "1852-1853", "1862", "1866-1868", "1872", "1874", "1877",
"1883-1885", "1887-1888", "1902", "1904", "1906-1910", "1912", "1914-1917", "1919", "1921", "1923", "1926", "1928",
"1932", "1934", "1936-1938", "1946", "1951", "1954", "1959-1960", "1965", "1970", "1978", "1998", "2003", "2005-2008",
"2010-2011", "2014-2015", "2028", "2031", "2050", "2054", "2084-2085", "2088", "2093", "2097", "2107", "2112", "2121",
"2126", "2140", "2144-2146", "2150", "2153-2154", "2186-2187", "2189", "2191", "2194", "2205-2206", "2227", "2229",
"2231", "2233", "2235-2240", "2242"]
PDPRV
["9-12", "18", "20", "26", "28", "30-31", "43", "63", "86-87", "90", "120", "155", "158", "160", "164", "168", "171", "174- 649 29.73% ["0-3", "5",
175", "178-179", "207-209", "220", "225", "229", "231", "233", "235-236", "239", "241", "243", "247", "249-250", "256",
"8", "592"264", "268", "271", "273", "275-276", "281-282", "284-285", "287", "290-292", "294", "299", "301", "306-307", "310635", "784311", "315", "337", "342-344", "348-352", "354-355", "360-362", "366", "368", "372-373", "379", "384-385", "390-392",
789", "1032"395", "397", "400-401", "404-406", "409", "412-413", "415", "422-423", "428-429", "431-432", "435-436", "439", "4411035", "1211443", "449-451", "455-457", "460-464", "468-469", "471", "474-476", "479", "492", "495", "497-498", "500-501", "504",
1217", "1230"509", "512-513", "515-517", "519", "521", "524", "527-529", "531-533", "536", "538-539", "541", "544", "547-552",
1233", "1282"554", "557", "559", "561", "563", "565-568", "574-578", "580-581", "584-586", "588-590", "592-597", "653-654", "658",
1284", "1294"660", "666-670", "673-678", "681-686", "689", "691", "697", "699-700", "705-706", "708-709", "711", "722", "728-732",
1296", "1647",
"734-735", "741-742", "749-750", "752", "754-757", "761", "763-770", "775", "777", "779-781", "791-792", "802-803",
"1819-1820",
"807", "809-810", "819-820", "823-824", "826", "828-831", "834", "836-838", "840-842", "844-846", "850-869", "871-873",
"1825",
"875-880", "884", "887-888", "890-891", "894", "897-898", "901-902", "906-907", "910", "912", "930", "932", "934-936",
"2182"]
"939-944", "947", "949", "951-953", "955", "958-962", "965", "977", "979", "981", "983", "985-986", "988", "991-995",
"998-1002", "1006-1010", "1013-1014", "1017-1018", "1024", "1027", "1029", "1031-1034", "1045", "1047-1051", "10541055", "1057-1063", "1066-1072", "1075-1077", "1079-1080", "1082-1086", "1092-1093", "1098", "1101-1103", "11051106", "1109-1111", "1114", "1130-1131", "1136", "1142-1143", "1151", "1153", "1156-1158", "1160", "1162-1164",
"1167", "1170", "1182", "1197", "1207-1209", "1211-1212", "1217", "1219-1223", "1225", "1227", "1234", "1237-1239",
"1241", "1243-1245", "1247", "1249-1251", "1255", "1257-1258", "1264-1266", "1273", "1276-1277", "1279-1282", "12851286", "1290-1297", "1299", "1303", "1308", "1312", "1314-1315", "1317", "1319", "1328-1329", "1332", "1334", "13361338", "1342", "1344", "1348", "1357", "1359-1364", "1371", "1373", "1394", "1396-1399", "1402", "1421", "1442-1444",
"1449", "1458", "1462", "1464-1469", "1471", "1473-1474", "1479", "1487", "1500", "1502-1503", "1512", "1519", "15211522", "1527-1533", "1538-1539", "1545", "1549", "1552", "1555-1556", "1558", "1564", "1567", "1575", "1579", "1627",
"1635", "1640", "1663", "1666-1667", "1670-1674", "1678", "1755-1757", "1759", "1763", "1766", "1773", "1783-1784",
"1793", "1797-1798", "1807", "1809", "1811-1812", "1828-1830", "1845-1849", "1851", "1854-1857", "1861-1862",
"1865", "1869", "1873", "1875", "1877-1878", "1883", "1895", "1900", "1902", "1906", "1914", "1919", "1944", "19461949", "1951-1952", "1957", "1972", "2024", "2031-2032", "2036", "2079", "2083-2084", "2089-2090", "2123", "2131",
"2143", "2166", "2172-2177"]
PDV
["7", "9-10", "13", "16", "19", "21", "24", "26", "28-31", "33", "35", "41", "43-44", "47", "51", "58", "63", "86", "90", "119- 810 37.09% ["1-3", "8",
120", "148", "155", "158", "160", "164", "166", "168-174", "179", "204-206", "209", "211", "220-222", "225", "231", "233",
"490-492",
"239-241", "243-244", "247", "249-250", "260-261", "264", "267-268", "273", "275-277", "279", "281-282", "284-285",
"594-648",
"287", "290", "294-295", "299", "302", "304", "306-307", "315", "337", "341-344", "346", "348-352", "354-355", "361"1033",
362", "368", "372", "374-375", "379", "381", "383-390", "392", "394-395", "399", "401-404", "406", "409", "412-413",
"1230-1233",
"415-417", "423", "426-432", "435-436", "438-444", "449-458", "460-463", "468-472", "474", "476-480", "484", "488",
"1281-1284",
"492", "494-495", "498-500", "504", "508", "512", "516-517", "519", "521", "527-529", "538", "540-541", "544", "548",
"1713-1714",
"550-552", "554-555", "558-559", "561-563", "565-568", "572-573", "576-584", "586-590", "592-597", "653-654", "656",
"1736",
"659", "664", "666", "669", "672-677", "679-686", "688-691", "693", "695", "697", "700", "704-705", "707", "711", "713",
"1814-1823",
"721-722", "726", "728-735", "741-742", "745", "748-750", "752-755", "757", "759-763", "765-768", "775", "780-781",
"2051-2053",
"784", "791-792", "795", "799", "802", "807", "809", "819-820", "824", "826-828", "834-838", "841-843", "845", "848",
"2183"]
"850-868", "870-873", "875-878", "880", "883-884", "887-888", "890-891", "894-897", "899", "901-903", "906-908", "910912", "915", "918", "922", "927", "929-930", "932", "935-936", "939-941", "943-944", "947-949", "956-960", "962", "966967", "971", "979-983", "988", "991-995", "999-1000", "1002", "1004-1010", "1014-1018", "1020-1022", "1024", "1028",
"1032-1034", "1037", "1041", "1049-1051", "1054", "1061-1062", "1065-1068", "1070-1072", "1075", "1077-1078", "10821086", "1088", "1091-1093", "1095", "1098", "1101-1103", "1106", "1109-1110", "1114", "1123", "1125", "1132-1135",
"1137-1139", "1141-1143", "1148", "1153", "1156-1158", "1161-1162", "1164-1165", "1170", "1172-1173", "1182", "11861187", "1190", "1194-1195", "1197-1198", "1200", "1204-1207", "1209", "1212", "1215", "1217", "1220-1225", "1230",
"1232-1234", "1238", "1240-1245", "1248", "1254", "1257", "1262", "1264", "1266-1268", "1272", "1274", "1276-1277",
"1279-1280", "1282-1283", "1285-1286", "1290", "1292-1293", "1296-1297", "1299-1301", "1303", "1305-1306", "1308",
"1311-1312", "1316-1319", "1323", "1325", "1327-1328", "1337", "1342-1343", "1347-1348", "1353-1354", "1357", "13591360", "1364", "1370-1373", "1375", "1379", "1381", "1396-1397", "1403", "1407", "1414", "1422", "1427", "1430-1431",
"1435", "1438-1440", "1442", "1444-1446", "1449", "1451-1454", "1458", "1461-1464", "1466-1470", "1472", "14741475", "1477-1479", "1487", "1491", "1493-1494", "1497-1503", "1509", "1511-1512", "1517", "1519-1522", "1525",
"1527", "1530-1533", "1536-1539", "1541", "1543", "1545", "1547", "1549", "1551-1553", "1555-1556", "1558-1559",
"1564", "1566-1568", "1571", "1575-1576", "1579", "1582", "1589", "1591", "1593-1594", "1596", "1599", "1602", "16051608", "1610-1612", "1614", "1627", "1634-1636", "1657", "1661", "1663", "1669", "1671", "1673", "1675-1678", "1681",
"1688", "1739", "1750-1751", "1755-1756", "1759", "1761", "1763-1764", "1766", "1768", "1773", "1781", "1783", "17861787", "1793-1794", "1797-1798", "1807", "1811-1814", "1817", "1819", "1822", "1828-1830", "1832-1833", "1843",
"1845", "1847-1851", "1853", "1855-1858", "1862", "1864", "1867", "1869", "1873", "1875", "1877-1879", "1887", "1892",
"1895", "1898", "1900-1901", "1906", "1911", "1919", "1939", "1944", "1946-1949", "1951-1952", "1955-1956", "1969",
"1972", "1991", "1995", "2023-2024", "2027", "2032", "2036", "2046", "2051", "2060", "2065", "2079", "2083-2085",
"2089", "2092-2093", "2123-2124", "2126", "2128", "2131", "2142-2143", "2164", "2166", "2168", "2170", "2172-2177",
"2179"]
PNVM15
[]
0
0.00% ["1-7", "126132", "624632", "701703", "757772", "1000",
"1026",
"1183-1213",
"1277",
"1686-1687",
"1957-1958",
"2038-2039"]
PNVMJ366
[]
0
0.00% ["1-7", "1266
132", "764766", "11891204", "12081210"]
RPV
["9-12", "18", "20", "26", "28", "30-31", "43", "63", "86-87", "90", "120", "155", "158", "160", "164", "168", "171", "174- 646 29.59% ["1-3", "5",
175", "178-179", "207-209", "220", "225", "229", "231", "233", "235-236", "239", "241", "243", "247", "249-250", "256",
"8", "13-14",
"264", "268", "271", "273", "275-276", "281-282", "284-285", "287", "290-291", "294", "299", "301", "306-307", "310"183-190",
311", "315", "337", "342-344", "348-352", "354-355", "360-362", "366", "368", "372-373", "379", "384-385", "390-392",
"489-492",
"395", "397", "400-401", "404-406", "409", "412-413", "415", "422-423", "428-429", "431-432", "435-436", "439", "441"494", "597443", "449-451", "455-457", "460-464", "468-469", "471", "474-476", "479", "492", "495", "497-498", "500-501", "504",
647", "1032"509", "512-513", "515-516", "519", "521", "524", "527-529", "531-533", "536", "538-539", "541", "544", "547-552",
1035", "1086"554", "557", "559", "561", "563", "565-568", "574-578", "580-581", "584-586", "588-590", "592-597", "653-654", "658",
1092", "1214"660", "666-670", "673-678", "681-682", "684-686", "689", "691", "697", "699-700", "705-706", "708-709", "711", "722",
1218", "1230"728-732", "734-735", "741-742", "749-750", "752", "754-757", "761", "763-770", "775", "777", "779-781", "791-792",
1232", "1278"802-803", "807", "809-810", "819-820", "823-824", "826", "828-831", "834", "836-838", "840-842", "844-846", "850-869",
1284", "1294"871-873", "875-880", "884", "887-888", "890-891", "894", "897-898", "901-902", "906-907", "910", "912", "930", "932",
1296", "1702"934-936", "939-944", "947", "949", "951-953", "955", "958-962", "965", "977", "979", "981", "983", "985-986", "988",
1709", "1776"991-995", "998-1002", "1006-1010", "1013-1014", "1017-1018", "1024", "1027", "1029", "1031-1034", "1045", "10471778", "2154-
"1289",
"13411342",
"1345",
"1349",
"1466",
"1473",
"1805",
"22052206"]
82
3.76%
["592597",
"10321034",
"12111212",
"1217",
"1282",
"12941296"]
16
0.73%
88
4.03%
["492",
"594597",
"1033",
"1230",
"12321233",
"12821283",
"1814",
"1817",
"1819",
"1822",
"2051"]
16
0.73%
82
4.02%
[]
0
0.00%
36
1.76%
[]
0
0.00%
116
5.31%
["492",
"597",
"10321034",
"1086",
"1092",
"1217",
"12791282",
"12941296"]
15
0.69%
RSV
SENV
SPIV41
SPIV5
TIOV
1051", "1054-1055", "1057-1063", "1066-1072", "1075-1077", "1079-1080", "1082-1086", "1092-1093", "1098", "11021103", "1105-1106", "1109-1111", "1114", "1130-1131", "1136", "1142-1143", "1151", "1153", "1156-1158", "1160",
"1162-1164", "1167", "1170", "1182", "1197", "1207-1209", "1211-1212", "1217", "1219-1223", "1225", "1227", "1234",
"1237-1239", "1241", "1243-1245", "1247", "1249-1251", "1255", "1257-1258", "1264-1266", "1273", "1276-1277", "12791282", "1285-1286", "1290-1297", "1299", "1303", "1308", "1312", "1314-1315", "1317", "1319", "1328-1329", "1332",
"1334", "1336-1338", "1342", "1344", "1348", "1357", "1359-1364", "1371", "1373", "1394", "1396-1399", "1402", "1407",
"1421", "1442-1444", "1449", "1458", "1462", "1464-1469", "1471", "1473-1474", "1479", "1487", "1500", "1502-1503",
"1512", "1519", "1521-1522", "1527-1533", "1538-1540", "1545", "1549", "1552", "1555-1556", "1558", "1564", "1567",
"1575", "1579", "1627", "1635", "1640", "1663", "1666", "1670-1674", "1678", "1755-1757", "1759", "1763", "1766",
"1773", "1783-1784", "1793", "1797-1798", "1807", "1809", "1811-1812", "1828-1830", "1845-1849", "1851", "18541857", "1861-1862", "1865", "1869", "1873", "1875", "1877-1878", "1883", "1895", "1900", "1902", "1906", "1914",
"1919", "1944", "1946-1949", "1951-1952", "1957", "1972", "2024", "2031-2032", "2036", "2079", "2083-2084", "20892090", "2123", "2131", "2143", "2166", "2173-2178"]
[]
2158"]
0
0.00%
["1-4", "7-8",
"135-149",
"172-183",
"1249-1276",
"1716",
"1761",
"2160",
"2162-2164"]
650 29.17% ["0-13", "197199", "258263", "486489", "599652", "1032",
"1219-1220",
"1377",
"1379-1380",
"1382-1387",
"1621-1627",
"2029-2036",
"2211-2227"]
["13-16", "22", "24", "30", "32", "34-35", "47", "64", "90-91", "94", "124", "163", "166", "168", "172", "176", "179", "182183", "186-187", "215-217", "228", "233", "237", "239", "241", "243-244", "247", "249", "251", "255", "257-258", "264",
"272", "276", "279", "281", "283-284", "289-290", "292-293", "295", "298-300", "302", "307", "309", "312", "314-315",
"318-319", "323", "345", "350-352", "356-360", "362-363", "368-370", "374", "376", "380-381", "387", "392-393", "398400", "403", "405", "408-409", "412-414", "417", "420-421", "423", "430-431", "436-437", "439-440", "443-444", "447",
"449-451", "458-459", "463-465", "468-472", "476-477", "479", "482-484", "487", "500", "503", "505-506", "508-509",
"512", "517", "520-521", "523-524", "527", "529", "532", "535-537", "539-541", "544", "546-547", "549", "552", "555560", "562", "565", "567", "569", "571", "573-576", "582-586", "588-589", "592-594", "596-598", "600-605", "653-654",
"658", "660", "666-670", "673-678", "681-682", "684-686", "689", "691", "697", "699-700", "705-706", "708-709", "711",
"722", "728-732", "734-735", "741-742", "749-750", "752", "754-757", "761", "763-770", "775", "777", "779-781", "791792", "802-803", "807", "809-810", "819-820", "823-824", "826", "828-831", "836-838", "840-842", "844-846", "850-869",
"871-873", "875-880", "884", "887-888", "890-891", "894", "897-898", "901-902", "906-907", "910-912", "930", "932",
"934-936", "939-944", "947", "949", "951-953", "955", "958-963", "965", "977", "979", "981-983", "985-986", "988", "991995", "998-1002", "1006-1010", "1013-1014", "1017-1018", "1024", "1027", "1029", "1031-1034", "1045", "1047-1051",
"1054", "1057-1063", "1066-1072", "1075-1077", "1079-1080", "1082-1086", "1092-1093", "1098", "1101-1103", "11051106", "1109-1111", "1114", "1130-1131", "1136", "1142-1143", "1151", "1153", "1156-1158", "1160", "1162-1164",
"1167", "1170", "1199", "1209-1211", "1213-1214", "1221-1225", "1227", "1229", "1236", "1239-1241", "1243", "12451247", "1249", "1251-1253", "1257", "1259-1260", "1266-1268", "1275", "1278-1279", "1281-1284", "1287-1288", "12921299", "1301", "1305", "1310", "1314", "1316-1317", "1319", "1321", "1330-1331", "1334", "1336", "1338-1340", "1344",
"1346", "1350", "1359", "1361-1366", "1373", "1375", "1396", "1398-1401", "1404", "1423", "1444-1446", "1451", "1460",
"1464", "1466-1473", "1475-1476", "1481", "1489", "1502", "1504-1505", "1514", "1521", "1523-1524", "1529-1535",
"1540-1541", "1547", "1551", "1554", "1557-1558", "1560", "1566", "1569", "1577", "1581", "1627", "1635", "1640",
"1663", "1666", "1670-1674", "1678", "1771-1773", "1775", "1779", "1782", "1789", "1799-1800", "1802", "1809", "18131814", "1823", "1825", "1827-1828", "1844-1846", "1863", "1865-1869", "1871", "1874-1877", "1881-1882", "1885",
"1889", "1893", "1895", "1897-1898", "1903", "1915", "1920", "1922", "1925-1926", "1934", "1939", "1964", "1966-1969",
"1971-1972", "1977", "1992", "2040", "2047-2048", "2052", "2101", "2105-2106", "2111-2112", "2147", "2155", "2167",
"2186", "2193-2198", "2200"]
["7-10", "16", "18", "24", "26", "28-29", "41", "61-62", "90-91", "94", "124", "169", "172", "174", "178", "182", "185",
655 28.87% ["1-7", "35"188-189", "192-193", "219-221", "232", "237", "241", "243", "245", "247-248", "251", "253", "255", "259", "261-262",
38", "145"268", "276", "280", "283", "285", "287-288", "293-294", "296-297", "299", "302-304", "306", "311", "313", "316", "318158", "502319", "322-323", "327", "349", "354-356", "360-364", "366-367", "372-374", "378", "380", "384-385", "391", "396-397",
510", "634"402-404", "407", "409", "412-413", "416-418", "421", "424-425", "427", "434-435", "440-441", "443-444", "447-448",
660", "1169"451-456", "461-463", "467-469", "472-476", "480-481", "483", "486-488", "491", "510", "513", "515-516", "518-519",
1171", "1212"522", "527", "530-531", "533-534", "537", "539", "542", "545-547", "549-551", "554", "556-557", "559", "562", "5651226", "1229570", "572", "575", "577", "579", "581", "583-586", "592-596", "598-599", "602-604", "606-608", "610-615", "663-664",
1230", "1290"668", "670", "676-680", "683-688", "691-692", "694-696", "699", "701", "707", "709-710", "715-716", "718-719", "721",
1297", "1368",
"732", "738-742", "744-745", "751-752", "759-760", "762", "764-767", "771", "773-780", "785", "787", "789-791", "801"1760-1761",
802", "812-813", "817", "819-820", "829-830", "833-834", "836", "838-841", "844", "846-848", "850-852", "854-856",
"1768-1769",
"860-879", "881-883", "885-890", "894", "897-898", "900-901", "904", "907-908", "911-912", "916-917", "920-922", "940",
"1772-1773",
"942", "944-946", "949-954", "957", "959", "961-963", "965", "968-973", "975", "987", "989", "991-993", "995-996",
"1843-1847",
"998", "1000-1005", "1008-1012", "1016-1020", "1023-1024", "1027-1028", "1034", "1037", "1039", "1041-1044", "1055",
"1849-1850",
"1057-1061", "1064-1065", "1067-1073", "1076-1082", "1085-1087", "1089-1090", "1092-1096", "1098", "1102-1103",
"1854",
"1108", "1111-1113", "1115-1116", "1119-1121", "1124", "1144-1145", "1150", "1156-1157", "1165", "1167", "1170"2258",
1172", "1174", "1176-1178", "1181", "1184", "1196", "1211", "1221-1223", "1225-1226", "1231", "1233-1237", "1239",
"2261-2268"]
"1241", "1248", "1252-1253", "1255", "1257-1259", "1261", "1263-1265", "1269", "1271-1272", "1278-1280", "1287",
"1290-1291", "1293-1296", "1299-1300", "1304-1311", "1313", "1317", "1322", "1326", "1328-1329", "1333", "13421343", "1346", "1348", "1350-1352", "1356", "1358", "1362", "1371", "1373-1378", "1385", "1387", "1408", "1410-1413",
"1416", "1421", "1437", "1456-1458", "1463", "1472", "1476", "1478-1483", "1485", "1487-1488", "1493", "1502", "1514",
"1516-1517", "1526", "1533", "1535-1536", "1541-1547", "1552-1553", "1559", "1563", "1566", "1569-1570", "1572",
"1578", "1581", "1589", "1593", "1639", "1647", "1681", "1684", "1688-1691", "1693", "1697", "1785-1787", "1789",
"1793", "1796", "1803", "1811", "1813-1814", "1823", "1827-1828", "1837", "1839", "1841-1842", "1858-1860", "18851889", "1891", "1894-1897", "1901-1902", "1905", "1909", "1913", "1915", "1917-1918", "1923", "1935", "1940", "1942",
"1945-1946", "1954", "1959", "1984", "1986-1989", "1991-1992", "1997", "2012", "2068", "2075-2077", "2125", "21292130", "2135-2136", "2171", "2195", "2220", "2227-2232", "2234"]
["7-10", "16", "18", "24", "26", "28-29", "41", "61", "90-91", "94", "124", "167", "170", "172", "176", "180", "183", "186- 657 29.14% ["1-6", "10",
187", "190-191", "217-219", "230", "235", "239", "241", "243", "245-246", "249", "251", "253", "257", "259-260", "266",
"151-156",
"274", "278", "281", "283", "285-286", "291-292", "294-295", "297", "300-302", "304", "309", "311", "314", "316-317",
"428", "500"320-321", "325", "347", "352-354", "358-362", "364-365", "370-372", "376", "378", "382-383", "389", "394-395", "400509", "617402", "405", "407", "410-411", "414-416", "419", "422-423", "425", "432-433", "438-439", "441-442", "445-446", "449",
651", "785"451-453", "459-461", "465-467", "470-474", "478-479", "481", "484-486", "489", "508", "511", "513-514", "516-517",
790", "1201"520", "525", "528-529", "531-532", "535", "537", "540", "543-545", "547-549", "552", "554-555", "557", "560", "5631215", "1218568", "570", "573", "575", "577", "579", "581-584", "590-594", "596-597", "600-602", "604-606", "608-613", "653-654",
1221", "1279"658", "660", "666-670", "673-678", "681-682", "684-686", "689", "691", "697", "699-700", "705-706", "708-709", "711",
1286", "1288",
"722", "728-732", "734-735", "741-742", "749-750", "752", "754-757", "761", "763-770", "775", "777", "779-781", "791"1748-1754",
792", "802-803", "807", "809-810", "819-820", "823-824", "826", "828-831", "834", "836-838", "840-842", "844-846",
"1759",
"850-869", "871-873", "875-880", "884", "887-888", "890-891", "894", "897-898", "901-902", "906-907", "910-912", "930",
"1761",
"932", "934-936", "939-944", "947", "949", "951-953", "955", "958-963", "965", "977", "979", "981-983", "985-986",
"1839-1840",
"988", "990-995", "998-1002", "1006-1010", "1013-1014", "1017-1018", "1024", "1027", "1029", "1031-1034", "1045",
"1843-1846",
"1047-1051", "1054-1055", "1057-1063", "1066-1072", "1075-1077", "1079-1080", "1082-1086", "1088", "1092-1093",
"2245-2254"]
"1098", "1101-1103", "1105-1106", "1109-1111", "1114", "1134-1135", "1140", "1146-1147", "1155", "1157", "11601162", "1164", "1166-1168", "1171", "1174", "1186", "1201", "1211-1213", "1215-1216", "1221", "1223-1227", "1229",
"1231", "1238", "1241-1243", "1245", "1247-1249", "1251", "1253-1255", "1259", "1261-1262", "1268-1270", "1277",
"1280-1281", "1283-1286", "1289-1290", "1294-1301", "1303", "1307", "1312", "1316", "1318-1319", "1321", "1323",
"1329", "1332-1333", "1336", "1338", "1340-1342", "1346", "1348", "1352", "1361", "1363-1368", "1375", "1377", "1398",
"1400-1403", "1406", "1411", "1427", "1446-1448", "1453", "1462", "1466", "1468-1473", "1475", "1477-1478", "1483",
"1492", "1504", "1506-1507", "1516", "1523", "1525-1526", "1531-1537", "1542-1544", "1549", "1553", "1556", "15591560", "1562", "1568", "1571", "1579", "1583", "1629", "1637", "1671", "1674", "1678-1681", "1683", "1687", "17751777", "1779", "1783", "1786", "1793", "1803-1804", "1806", "1813", "1817-1818", "1827", "1829", "1831-1832", "1834",
"1848-1850", "1875-1879", "1881", "1884-1887", "1891-1892", "1895", "1899", "1903", "1905", "1907-1908", "1913",
"1925", "1930", "1932", "1935-1936", "1944", "1949", "1974", "1976-1979", "1981-1982", "1987", "2002", "2056", "20632065", "2113", "2117-2118", "2123-2124", "2159", "2183", "2208", "2215-2220", "2222"]
["7-10", "16", "18", "24", "26", "28-29", "41", "61", "65", "94-95", "98", "128", "175", "178", "180", "184", "188", "191", 659 29.02% ["1-4", "7",
"194-195", "198-199", "225-227", "238", "243", "247", "249", "251", "253-254", "257", "259", "261", "265", "267-268",
"74-85", "161"274", "282", "286", "289", "291", "293-294", "299-300", "302-303", "305", "308-310", "312", "317", "319", "322", "324167", "623325", "328-329", "333", "355", "360-362", "366-370", "372-373", "378-380", "384", "386", "390-391", "397", "402-403",
660", "795"408-410", "413", "415", "418-419", "422-424", "427", "430-431", "433", "440-441", "446-447", "449-450", "453-454",
796", "798-
67
3.09%
[]
0
0.00%
125
5.61%
["13",
"258",
"487",
"600605",
"1032",
"1627"]
11
0.49%
113
4.98%
["7",
"510",
"11701171",
"12211223",
"12251226",
"12901291",
"12931296"]
15
0.66%
118
5.23%
["10",
"508",
"1201",
"12111213",
"1215",
"1221",
"12801281",
"12831286"]
14
0.62%
136
5.99%
["7",
"803804",
"1169",
"1171",
13
0.57%
TUPV
"457", "459-461", "467-469", "473-475", "478-482", "486-487", "489", "492-494", "497", "516", "519", "521-522", "524805", "1169525", "528", "533", "536-537", "539-540", "543", "545", "548", "551-553", "555-557", "560", "562-563", "565", "568",
1177", "1217"571-576", "578", "581", "583", "585", "587", "589-592", "598-602", "604-605", "608-610", "612-614", "616-621", "6651229", "1233666", "670", "672", "678-682", "685-690", "693-694", "696-698", "701", "703", "709", "711-712", "717-718", "720-721",
1236", "1735"723", "734", "740-744", "746-747", "753-754", "761-762", "764", "766-769", "773", "775-782", "787", "789", "791-793",
1737", "1768"803-804", "814-815", "819", "821-822", "831-832", "835-836", "838", "840-843", "846", "848-850", "852-854", "856-858",
1775", "1851",
"862-881", "883-885", "887-892", "896", "899-900", "902-903", "906", "909-910", "913-914", "918-919", "922-924", "942",
"1855-1861",
"944", "946-948", "951-956", "959", "961", "963-965", "967", "970-975", "977", "984", "989", "991", "993-995", "997"2020-2028",
998", "1000", "1002-1007", "1010-1014", "1018-1022", "1025-1026", "1029-1030", "1036", "1039", "1041", "1043-1046",
"2259-2260",
"1057", "1059-1063", "1066-1067", "1069-1075", "1078-1084", "1087-1089", "1091-1092", "1094-1098", "1104-1105",
"2262-2269"]
"1110", "1113-1115", "1117-1118", "1121-1123", "1126", "1148-1149", "1154", "1160-1161", "1169", "1171", "11741176", "1178", "1180-1182", "1185", "1188", "1200", "1215", "1225-1227", "1229-1230", "1235", "1237-1241", "1243",
"1245", "1252", "1255-1257", "1259", "1261-1263", "1265", "1267-1269", "1273", "1275-1276", "1282-1284", "1291",
"1294-1295", "1297-1300", "1303-1304", "1308-1315", "1317", "1321", "1326", "1330", "1332-1333", "1335", "1337",
"1346-1347", "1350", "1352", "1354-1356", "1360", "1362", "1366", "1375", "1377-1382", "1389", "1391", "1412", "14141417", "1420", "1425", "1441", "1460-1462", "1467", "1476", "1480", "1482-1487", "1489", "1491-1492", "1497", "1506",
"1518", "1520-1521", "1530", "1537", "1539-1540", "1545-1551", "1556-1558", "1563", "1567", "1570", "1573-1574",
"1576", "1582", "1585", "1593", "1597", "1643", "1651", "1656", "1685", "1688-1689", "1692-1695", "1697", "1701",
"1791-1793", "1795", "1799", "1802", "1809", "1817", "1819-1820", "1829", "1833-1834", "1843", "1845", "1847-1848",
"1850", "1864-1866", "1891-1895", "1897", "1900-1903", "1907-1908", "1911", "1915", "1919", "1921", "1923-1924",
"1929", "1941", "1946", "1948", "1951-1952", "1960", "1965", "1990", "1992-1995", "1997-1998", "2003", "2018", "2072",
"2079-2080", "2084", "2129", "2133-2134", "2139-2140", "2175", "2199", "2224", "2231-2236", "2238"]
["11-14", "20", "22", "28", "30", "32-33", "45", "65", "69", "88-89", "92", "122", "161", "164", "166", "170", "174", "177", 662 29.16% ["1-7", "59",
"180-181", "184-185", "213-215", "228", "233", "237", "239", "241", "243-244", "247", "249", "251", "255", "257-258",
"140-154",
"264", "272", "276", "279", "281", "283-284", "289-290", "292-293", "295", "298-300", "302", "307", "309", "312", "314"603", "606315", "318-319", "323", "345", "350-352", "356-360", "362-363", "368-370", "374", "376", "380-381", "387", "392-393",
651", "654"398-400", "403", "405", "408-409", "412-414", "417", "420-421", "423", "430-431", "436-437", "439-440", "443-444",
674", "708"447-451", "457-459", "463-465", "468-472", "476-477", "479", "482-484", "487", "500", "503", "505-506", "508-509",
710", "712"512", "517", "520-521", "523-524", "527", "529", "532", "535-537", "539-541", "544", "546-547", "549", "552", "555713", "715560", "562", "565", "567", "569", "571", "573-576", "582-586", "588-589", "592-594", "596-598", "600-605", "720", "723718", "1133724", "728", "730", "736-740", "743-748", "751-752", "754-756", "759", "761", "767", "769-770", "775-776", "778-779",
1135", "1159"781", "792", "798-802", "804-805", "811-812", "819-820", "822", "824-827", "831", "833-840", "845", "847", "849-851",
1169", "1273",
"861-862", "872-873", "877", "879-880", "889-890", "893-894", "896", "898-901", "904", "906-908", "910-912", "914-916",
"1278-1280",
"918", "920-939", "941-943", "945-950", "954", "957-958", "960-961", "964", "967-968", "971-972", "976-977", "980-982",
"1288-1290",
"999-1000", "1002", "1004-1006", "1009-1014", "1017", "1019", "1021-1023", "1025", "1028-1033", "1035", "1047",
"1294-1299",
"1049", "1051-1053", "1055-1056", "1058", "1060-1065", "1068-1072", "1076-1080", "1083-1084", "1087-1088", "1094",
"1347-1354",
"1097", "1099", "1101-1104", "1115", "1117-1121", "1124-1125", "1127-1133", "1136-1142", "1145-1147", "1149-1150",
"1780",
"1152-1156", "1162-1163", "1168", "1171-1173", "1175-1176", "1179-1181", "1184", "1200-1201", "1206", "1212-1213",
"2094",
"1221", "1223", "1226-1228", "1230", "1232-1234", "1237", "1240", "1252", "1267", "1277-1279", "1281-1282", "1287",
"2097",
"1289-1293", "1295", "1297", "1304", "1307-1309", "1311", "1313-1315", "1317", "1319-1321", "1325", "1327-1328",
"2099-2100",
"1334-1336", "1343", "1346-1347", "1349-1352", "1355-1356", "1360-1367", "1369", "1373", "1378", "1382", "1384"2160-2161",
1385", "1389", "1398-1399", "1402", "1404", "1406-1408", "1412", "1414", "1418", "1427", "1429-1434", "1441", "1443",
"2163-2174"]
"1464", "1466-1469", "1472", "1477", "1491", "1512-1514", "1519", "1528", "1532", "1534-1539", "1541", "1543-1544",
"1549", "1557", "1570", "1572-1573", "1582", "1589", "1591-1592", "1597-1603", "1608-1610", "1615", "1619", "1622",
"1625-1626", "1628", "1634", "1637", "1645", "1649", "1697", "1705", "1710", "1733", "1736", "1740-1744", "1748",
"1763", "1835-1837", "1839", "1843", "1846", "1853", "1863-1864", "1866", "1873", "1877-1878", "1887", "1889", "18911892", "1894", "1908-1910", "1927", "1929-1933", "1935", "1938-1941", "1945-1946", "1949", "1953", "1957", "1959",
"1961-1962", "1967", "1979", "1984", "1986", "1989-1990", "1998", "2003", "2028", "2030-2033", "2035-2036", "2041",
"2056", "2110", "2117-2118", "2122", "2165", "2169-2170", "2175-2176", "2211", "2219", "2231", "2254", "2260-2265"]
"11741176",
"12251227",
"1229",
"1235"]
154
6.78%
["603",
"1133",
"11621163",
"1168",
"12781279",
"12891290",
"1295",
"1297",
"1347",
"13491352",
"2165",
"21692170"]
19
0.84%
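The "#" column of these tables can be checked against the position lists: each entry is either a single residue position or an inclusive range, and the count is the total number of residues covered. The sketch below is a minimal illustration of that check, not part of the DisSIC pipeline. The function names `count_positions` and `percent_of_length` are hypothetical, and the assumption that the "%" column is the count divided by the protein length is inferred from the table values rather than stated in the table. The example list is the final "Both" list above, with its range hyphens restored.

```python
def count_positions(entries):
    """Count residues covered by entries such as "603" or "1162-1163".

    Ranges are treated as inclusive, so "1349-1352" covers 4 residues.
    """
    total = 0
    for entry in entries:
        start, _, end = entry.partition("-")
        first = int(start)
        last = int(end) if end else first  # single position: range of length 1
        total += last - first + 1
    return total


def percent_of_length(count, sequence_length):
    """Percentage of the sequence covered (assumed meaning of the % column)."""
    return round(100.0 * count / sequence_length, 2)


# Final "Both" list of the preceding table (hyphens restored).
both = ["603", "1133", "1162-1163", "1168", "1278-1279", "1289-1290",
        "1295", "1297", "1347", "1349-1352", "2165", "2169-2170"]

print(count_positions(both))  # 19, matching the "#" value listed for this row
```

The percentage helper takes the sequence length as an argument because the protein lengths are not listed in these tables.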
B. Rhabdoviridae L Polymerase
Name | CICP Position | CICP # | CICP % | Disorder Position | Disorder # | Disorder % | Both Position | Both # | Both %
ABLV
BEFV
["9", "12", "28", "32", "44-46", "60", "71", "96", "110", "112", "117", "120", "127", "129", "134",
460 21.62% ["0-24", "206-208", "477", "479- 102
4.79% ["9", "12", 13
0.61%
"143", "156", "172-174", "177", "186", "195", "198-199", "214", "218", "236", "238", "246", "249",
480", "514-518", "716-717",
"479",
"251-252", "255-256", "259-260", "267", "269", "276", "301", "305-308", "312", "314", "319", "321",
"719-728", "1070-1073", "1583"515"323-325", "329-330", "336", "345", "352", "358-359", "361-363", "366-368", "375", "386", "390",
1588", "1612-1627", "1645518",
"400", "411", "414", "418", "420", "424-425", "435", "438-439", "442", "446", "448", "450-453",
1656", "1746-1748", "2092"721"466", "470-471", "478-479", "485-487", "490", "498", "502", "504-506", "508-513", "515-518",
2095", "2118-2121", "2123722",
"524", "526-527", "532", "560", "570", "572-574", "579", "586", "588-589", "596-599", "613", "615",
2127"]
"1072",
"631", "638-640", "642", "645", "647", "664", "669", "676", "680", "682", "704-706", "710", "715",
"1613",
"721-722", "732-734", "736", "738-739", "747", "752", "755", "759", "762-763", "778", "784", "786",
"1617",
"788", "794", "796", "800", "802-804", "813-814", "821-822", "824-827", "829", "838-841", "843"1619"]
845", "847-848", "851", "854-855", "857-858", "860", "866", "868", "870", "872-873", "884", "888",
"891", "902", "915", "919", "928", "937-938", "940", "950", "952", "956-957", "959", "963", "970971", "973", "976-977", "979", "982-985", "988", "998-1001", "1009-1010", "1013", "1031", "10341035", "1037", "1039", "1042", "1044", "1051", "1055", "1057-1059", "1061", "1065-1066", "10681069", "1072", "1077", "1079-1080", "1090", "1108", "1115", "1128", "1132", "1143", "1145",
"1162", "1169", "1172-1173", "1175-1177", "1179", "1181-1185", "1187-1188", "1193-1194",
"1196", "1198", "1203", "1213", "1217", "1223", "1225", "1246", "1253", "1256-1259", "1262",
"1264-1265", "1273", "1278", "1287", "1289-1290", "1294-1296", "1299", "1308", "1326", "1335",
"1338", "1340-1341", "1344", "1347-1349", "1353", "1379-1380", "1382", "1386-1387", "13891391", "1396", "1401-1403", "1408", "1410", "1412", "1418", "1420", "1422", "1424", "1428",
"1432-1433", "1436", "1438", "1443", "1447", "1449", "1451", "1456-1459", "1462", "1464", "1467",
"1470", "1474-1475", "1479-1480", "1483", "1485", "1487", "1495", "1497", "1499", "1501", "1504",
"1507", "1509", "1528", "1530", "1534", "1537-1538", "1540", "1543-1544", "1553", "1555", "1562",
"1568", "1572", "1576-1578", "1613", "1617", "1619", "1636", "1638", "1641", "1659", "1667",
"1687", "1689-1690", "1697-1698", "1700", "1710-1712", "1716", "1718", "1722", "1731", "1736",
"1740", "1750", "1758-1760", "1784", "1798", "1801", "1806", "1810", "1814", "1820", "1824",
"1828", "1835", "1841-1842", "1847", "1857", "1861", "1868", "1876", "1881", "1886", "1918",
"1922", "1925-1926", "1928-1929", "1932", "1936", "1939-1940", "1942", "1946", "1949", "1969",
"1976", "1983-1984", "1996", "2002", "2008", "2013", "2019", "2033", "2035-2037", "2063-2064",
"2070", "2073", "2081", "2107", "2110"]
["11", "41-42", "45-47", "49", "51", "53", "58", "109", "148", "190", "252", "258-259", "263", "266- 388 18.10%
["0-20", "67-76", "79", "205122
5.69%
["11",
25
1.17%
267", "269", "292", "295", "304", "306-308", "311-313", "315", "333", "337-338", "342", "345",
207", "209-212", "329-339",
"333",
"348", "360", "371", "377-378", "380-382", "384-385", "388-390", "393", "396", "400-401", "405",
"440-443", "828-830", "832"337"408", "411", "420", "426", "430", "432", "434", "445-446", "451", "454", "457", "463-465", "467",
835", "955-962", "965-967",
338",
"470", "473", "476", "478-480", "484-485", "487", "489", "491", "494-497", "513", "515-519", "522",
"978-990", "1128-1132", "1376"828"525", "530", "534", "538-539", "542-544", "547", "551", "557", "559", "563", "565", "567", "5701380", "1591-1594", "1681",
829",
571", "573", "576", "578", "580", "582", "585", "587", "590-591", "593", "598", "600-601", "607"1747-1750", "1754-1757",
"832",
608", "613-616", "619-620", "626-627", "632", "634-636", "638", "642-643", "650-652", "658",
"1938", "1940-1942", "1996"834"663", "668-670", "672-675", "677-679", "683", "696-697", "701", "703", "709", "712", "714-717",
2001", "2070-2073"]
835",
"721-724", "728", "733", "741", "743", "747", "758", "761", "764", "768", "772", "775-776", "779",
"956"787-788", "796-797", "802", "805", "808-809", "818-819", "821-822", "826-829", "832", "834-835",
957",
"837-838", "841-847", "849-850", "858", "864", "875", "880", "905-906", "913-914", "916", "918",
"966",
"925", "927-930", "935-936", "938", "956-957", "966", "972-974", "977", "982", "984-985", "987"982",
988", "990-991", "994", "998", "1000", "1010", "1029-1030", "1032", "1034-1036", "1040", "1043",
"984-
CHPV
FLAV
HIRV
IHNV
ISFV
LNYV
"1045", "1048", "1051-1052", "1054", "1056-1059", "1061", "1064", "1067", "1082-1083", "1087",
"1108-1109", "1113", "1119", "1122", "1129", "1131", "1135", "1159", "1163", "1181", "1186",
"1193", "1195-1202", "1211", "1213", "1218", "1221", "1223", "1225", "1232", "1235-1236", "1238",
"1247-1250", "1252-1254", "1256-1258", "1260-1261", "1263", "1265", "1293-1295", "1297",
"1299", "1302", "1322", "1324", "1330", "1332", "1356", "1376", "1390", "1400", "1417", "1419",
"1422", "1434", "1450", "1458", "1473", "1485", "1489", "1499", "1505", "1510", "1517-1518",
"1540", "1543", "1614-1615", "1693-1694", "1696", "1717", "1719", "1721-1722", "1727", "1734",
"1737", "1739", "1750-1751", "1755-1757", "1771", "1780-1782", "1788", "1790", "1803-1804",
"1810", "1816", "1832", "1839", "1841-1842", "1847", "1859", "1865", "1870", "1873", "1878-1879",
"1881", "1898", "1986", "2009"]
["9", "30-31", "34-36", "38", "40", "42", "44", "88", "128", "166", "218", "224-225", "229", "232",
377
"235", "257", "260", "269", "271-273", "276-278", "280", "298", "302-303", "307", "310", "313",
"324", "335", "341-342", "344-346", "348-349", "352-354", "357", "360", "364-365", "369", "372",
"375", "384", "390", "394", "396", "398", "409-410", "415", "418", "421", "427-429", "431", "434",
"437", "440", "442-444", "448-449", "451", "453", "455", "458-461", "477", "479-483", "486", "489",
"494", "498", "502-503", "506-508", "511", "515", "521", "523", "527", "529", "531", "534-535",
"537", "540", "542", "544", "546", "549", "554-555", "557", "562", "564-565", "571-572", "577-580",
"583-584", "590-591", "596", "598", "600", "602", "606-607", "614-616", "622", "627", "632-634",
"636-639", "641-643", "647", "660-661", "665", "667", "673", "676", "678-681", "685-688", "692",
"697", "705", "707", "711", "722", "725", "728", "732", "736", "739-740", "743", "751-752", "761",
"766", "769", "772-773", "782-783", "785-786", "790-793", "796", "798-799", "801-802", "805-811",
"813-814", "822", "828", "839", "844", "870-871", "878", "883", "890", "892-895", "900-901", "903",
"922", "931", "937-939", "942", "947", "949-950", "952-953", "955-956", "959", "963", "965", "975",
"994-995", "999-1001", "1005", "1008", "1010", "1013", "1016-1017", "1019", "1021-1024", "1026",
"1029", "1032", "1047-1048", "1052", "1071-1072", "1076", "1082", "1085", "1092", "1094", "1098",
"1120", "1124", "1140", "1142", "1147", "1154", "1156-1163", "1172", "1174", "1179", "1182",
"1186", "1193", "1196-1197", "1199", "1208-1211", "1213-1215", "1217-1219", "1221-1222",
"1224", "1226", "1254-1256", "1258", "1260", "1263", "1283", "1285", "1291", "1293", "1336",
"1348", "1358", "1375", "1377", "1380", "1392", "1408", "1431", "1443", "1447", "1457", "1463",
"1468", "1475-1476", "1501", "1504", "1569-1570", "1635-1636", "1638", "1659", "1661", "16631664", "1669", "1676", "1679", "1681", "1692-1693", "1697-1699", "1713", "1722-1724", "1730",
"1732", "1745-1746", "1752", "1758", "1774", "1781", "1783-1784", "1801", "1807", "1812", "1815",
"1820-1821", "1823", "1839", "1927", "1944"]
["38-39", "42-44", "46", "48", "50", "55", "99", "141", "184", "241", "247-248", "252", "255-256",
386
"258-259", "281", "284", "293", "295-297", "300-302", "304", "322", "326-327", "331", "334", "337",
"348", "359", "365-366", "368-370", "372-373", "376-378", "381", "384", "388-389", "393", "396",
"399", "408", "414", "418", "420", "422", "433-434", "439", "442", "445", "451-453", "455", "458",
"461", "464", "466-468", "472-475", "477", "479", "482-485", "501", "503-507", "510", "518", "522",
"526-527", "530-532", "535", "539", "545", "547", "551", "553", "555", "558-559", "561", "564",
"566", "568", "570", "573", "578-579", "581", "586", "588-589", "595-596", "601-604", "607-608",
"614", "620", "622-624", "626", "630", "638-640", "646", "651", "656-658", "660-663", "665-667",
"671", "685-686", "689-690", "697", "700", "702-705", "709-712", "716", "721", "729", "733", "735",
"746", "749", "752", "756", "763-764", "767", "775-776", "785", "790", "793", "796-797", "806-807",
"809-810", "814-817", "820", "822-823", "825-826", "829-835", "837-838", "846", "852", "863",
"868", "894-895", "902-903", "905", "907", "914", "916-919", "924-925", "927", "945-946", "955",
"961-963", "966", "971", "973-974", "976-977", "979-980", "983", "987", "989", "999", "1018-1019",
"1021", "1023-1025", "1029", "1032", "1034", "1037", "1040", "1043", "1045", "1047-1048", "1050",
"1053", "1056", "1068", "1071-1072", "1076", "1097-1098", "1102", "1108", "1111", "1118", "1120",
"1124", "1146", "1150", "1168", "1173", "1179-1180", "1182-1189", "1195-1196", "1198", "1200",
"1205", "1208", "1210", "1212", "1219", "1221-1223", "1225", "1234-1237", "1239-1241", "12431244", "1247-1248", "1250", "1252", "1280-1282", "1284", "1286", "1289", "1309", "1311", "1317",
"1319", "1343", "1363", "1375", "1385", "1402", "1404", "1407", "1419", "1435", "1443", "14581459", "1470", "1474", "1484", "1490", "1495", "1502-1503", "1529", "1532", "1596-1597", "16621663", "1665", "1686", "1688", "1690-1691", "1696", "1703", "1706", "1708", "1719-1720", "17241726", "1740", "1749-1751", "1757", "1759", "1772-1773", "1779", "1785", "1801", "1808", "18101811", "1827", "1833", "1838", "1841", "1846-1847", "1849", "1866", "1954-1955", "1973"]
[]
0
[]
0
["9", "30-31", "34-36", "38", "40", "42", "44", "88", "128", "166", "217", "223-224", "228", "231",
377
"234", "256", "259", "268", "270-272", "275-277", "279", "297", "301-302", "306", "309", "312",
"323", "334", "340-341", "343-345", "347-348", "351-353", "356", "359", "363-364", "368", "371",
"374", "383", "389", "393", "395", "397", "408-409", "414", "417", "420", "426-428", "430", "433",
"436", "439", "441-443", "447-448", "450", "452", "454", "457-460", "476", "478-482", "485", "488",
"493", "497", "501-502", "505-507", "510", "514", "520", "522", "526", "528", "530", "533-534",
"536", "539", "541", "543", "545", "548", "550", "553-554", "556", "561", "563-564", "570-571",
"576-579", "582-583", "589-590", "595", "597", "599", "601", "605-606", "613-615", "621", "626",
"631-633", "635-638", "640-642", "646", "659-660", "664", "666", "672", "675", "677-680", "684687", "691", "696", "704", "706", "710", "721", "724", "727", "731", "735", "738-739", "742", "750751", "760", "765", "768", "771-772", "781-782", "784-785", "789-792", "795", "797-798", "800801", "804-810", "812-813", "821", "827", "838", "843", "870-871", "878", "883", "890", "892-895",
"900-901", "903", "922", "931", "937-939", "942", "947", "949", "952-953", "955-956", "959", "963",
"965", "975", "994-995", "999-1001", "1005", "1008", "1010", "1013", "1016-1017", "1019", "10211024", "1026", "1029", "1032", "1047-1048", "1052", "1071-1072", "1076", "1082", "1085", "1092",
"1094", "1098", "1120", "1124", "1142", "1147", "1154", "1156-1163", "1172", "1174", "1179",
"1182", "1186", "1193", "1196-1197", "1199", "1208-1211", "1213-1215", "1217-1219", "12211222", "1224", "1226", "1254-1256", "1258", "1260", "1263", "1283", "1285", "1291", "1293",
"1336", "1348", "1358", "1375", "1377", "1380", "1392", "1408", "1416", "1431", "1443", "1447",
"1457", "1463", "1468", "1475-1476", "1501", "1504", "1569-1570", "1635-1636", "1638", "1659",
"1661", "1663-1664", "1669", "1676", "1679", "1681", "1692-1693", "1697-1699", "1713", "17221724", "1730", "1732", "1745-1746", "1752", "1758", "1774", "1781", "1783-1784", "1801", "1807",
"1812", "1815", "1820-1821", "1823", "1839", "1927", "1945"]
[]
0
18.02%
["1-12", "465-482", "577", "579", "1059-1063", "1093-1107", "1147-1161", "1223-1234", "1365-1371", "1373-1374", "1453-1459", "1466", "1527-1532", "1615", "1629-1631", "1688-1701", "1722-1725", "1940-1945", "2087-2091"]
135
6.45%
18.70%
["0-14", "870-873", "1157", "1161-1169", "1171-1172", "1175", "1178-1185", "1719-1721", "2051-2063"]
56
2.71%
0.00%
["1-2", "4", "15-28", "30-70", "142-144", "146-147", "342", "344", "396-399", "445-452", "531-542", "550-558", "837-842", "858-861", "1100-1120", "1224-1233", "1235-1236", "1311-1323", "1344-1346", "1504-1505", "1507", "1575-1578", "1758-1767", "1979", "1981-1985"]
["1-2", "4", "31-58", "61-70", "88-92", "180-185", "396-399", "431-439", "443-451", "535-542", "550-558", "845-854", "858-860", "938-942", "1092-1093", "1098-1099", "1101", "1103-1104", "1106-1119", "1226-1231", "1233", "1315-1324", "1498-1499", "1501-1507", "1578", "1760-1766", "1889-1898", "1979", "1982-1985"]
["0-27", "184-186", "464-482", "718", "723", "1149-1162", "1225-1227", "1371", "1414-1417", "1450-1457", "1524-1533", "1535", "1615", "1686-1702", "1912", "2083-2092"]
180
["0-19", "53-62", "102", "144149", "181-182", "475-477",
"480-481", "483-484", "486495", "625", "629", "631", "634637", "640-645", "697-701",
"1125-1129", "1131-1138",
0.00%
18.01%
0.00%
985",
"987988",
"990",
"1129",
"1131",
"1376",
"1750",
"17551757"]
["9",
"477",
"479482",
"577",
"579",
"1094",
"1098",
"1147",
"1154",
"11561161",
"1224",
"1226",
"1457",
"16921693",
"16971699",
"17221724",
"1944"]
30
1.43%
["1168",
"11791180",
"11821185",
"17191720"]
9
0.44%
9.06%
[]
0
0.00%
179
9.01%
[]
0
0.00%
122
5.83%
["9",
"476",
"478482",
"1154",
"11561162",
"1226",
"1416",
"1457",
"16921693",
"16971699"]
23
1.10%
143
6.91%
[]
0
0.00%
MFSV
[]
0
0.00%
MMV
[]
0
0.00%
MOKV
NCMV
RABV
RYSV
SCRV
SNAKV
["8", "30-31", "34-36", "38", "40", "42", "44", "95", "140", "184", "240", "246-247", "251", "254",
381
"257", "279", "282", "291", "293-295", "298-300", "302", "320", "324-325", "329", "332", "335",
"347", "358", "364-365", "367-369", "371-372", "375-377", "380", "383", "387-388", "392", "395",
"398", "407", "413", "417", "419", "421", "432-433", "438", "441", "444", "450-452", "454", "457",
"460", "463", "465-467", "471-472", "474", "476", "478", "481-484", "500", "502-506", "509", "512",
"516", "521", "525-526", "529-531", "534", "538", "544", "546", "550", "552", "554", "557-558",
"560", "563", "565", "567", "569", "572", "577-578", "580", "585", "587-588", "594-595", "600-603",
"606-607", "613-614", "619", "621-623", "625", "629-630", "637-639", "645", "650", "655-657",
"659-662", "664-666", "670", "685-686", "690", "692", "698", "701", "703-706", "710-713", "717",
"722", "730", "732", "736", "747", "750", "753", "757", "761", "764-765", "768", "776-777", "785786", "791", "794", "797-798", "807-808", "810-811", "815-818", "821", "823-824", "826-827", "830836", "838-839", "847", "853", "860", "869", "894-895", "902", "907", "914", "916-919", "924-925",
"927", "945-946", "955", "961-963", "966", "971", "973-974", "976-977", "979-980", "983", "987",
"989", "999", "1018-1019", "1023-1025", "1029", "1032", "1034", "1037", "1040-1041", "1043",
"1045-1048", "1050", "1053", "1056", "1071-1072", "1076", "1093-1094", "1098", "1104", "1107",
"1114", "1116", "1120", "1143", "1148", "1163", "1165", "1170", "1177", "1179-1186", "1195",
"1197", "1202", "1205", "1209", "1216", "1219-1220", "1222", "1232-1235", "1237-1239", "12411243", "1245-1246", "1248", "1250", "1280-1282", "1284", "1286", "1289", "1311", "1313", "1319",
"1321", "1364", "1376", "1386", "1403", "1405", "1408", "1420", "1436", "1459", "1471", "1475",
"1485", "1491", "1496", "1503-1504", "1530", "1533", "1608-1609", "1680-1681", "1683", "1704",
"1706", "1708-1709", "1714", "1721", "1724", "1726", "1737-1738", "1742-1744", "1759", "17681770", "1776", "1778", "1791-1792", "1798", "1804", "1819", "1826", "1828-1829", "1834", "1846",
"1852", "1857", "1860", "1865-1866", "1868", "1885", "1975", "1994"]
[]
0
17.91%
["8", "30-31", "34-36", "38", "40", "42", "44", "140", "184", "240", "246-247", "251", "254", "257", 383
"279", "282", "291", "293-295", "298-300", "302", "320", "324-325", "329", "332", "335", "347",
"356", "358", "364-365", "367-369", "371-372", "375-377", "380", "383", "387-388", "392", "395",
"398", "407", "411", "413", "417", "419", "421", "432-433", "438", "441", "444", "450-452", "454",
"457", "460", "463", "465-467", "471-472", "474", "476", "478", "481-484", "500", "502-506", "509",
"516", "521", "525-526", "529-531", "534", "538", "544", "546", "550", "552", "554", "557-558",
"560", "563", "565", "567", "569", "572", "577-578", "580", "585", "587-588", "594-595", "600-603",
"606-607", "613-614", "619", "621", "623", "625", "629-630", "637-639", "645", "650", "655-657",
"659-662", "664-666", "685-686", "690", "692", "698", "701", "703-706", "710-713", "717", "722",
"730", "732", "736", "747", "750", "753", "757", "761", "764-765", "768", "776-777", "786", "791",
"794", "797-798", "807-808", "810-811", "815-818", "821", "823-824", "826-827", "830-836", "838839", "847", "853", "860", "869", "894-895", "902-903", "905", "907", "914", "916-919", "924-925",
"927", "945-946", "961-963", "966", "971", "973", "976-977", "979-980", "983", "987", "989", "999",
"1018-1019", "1021", "1023-1025", "1028-1029", "1032", "1034", "1037", "1040-1041", "1043",
"1045-1048", "1050", "1053", "1056", "1071-1072", "1076", "1093-1094", "1098", "1104", "1114",
"1116", "1120", "1143", "1160", "1165", "1170", "1176-1177", "1179-1186", "1195", "1197", "1202",
"1205", "1209", "1216", "1219-1220", "1222", "1232-1235", "1237", "1239", "1241-1243", "12451246", "1248", "1280-1282", "1284", "1286", "1289-1290", "1311", "1313", "1316", "1319", "1321",
"1325", "1338", "1345", "1376", "1386", "1403", "1405", "1420", "1436", "1444", "1459", "1471",
"1475", "1485", "1488-1489", "1491", "1496", "1498", "1503-1504", "1530", "1533", "1608-1609",
"1680-1681", "1683", "1704", "1706", "1708-1709", "1714", "1721", "1724", "1726", "1737-1738",
"1742-1744", "1759", "1768-1770", "1776", "1778", "1791-1792", "1798", "1804", "1819", "1826",
"1828-1829", "1846", "1852", "1857", "1860", "1866", "1868", "1885", "1975", "1982", "1994"]
[]
0
17.88%
["11", "33-34", "37-39", "41", "43", "45", "47", "93", "132", "171", "230", "236-237", "241", "244- 385
245", "247-248", "269", "272", "281", "283-285", "288-290", "292", "310", "314-315", "319", "322",
"325", "336", "347", "353-354", "356-358", "360-361", "364-366", "369", "372", "376-377", "381",
"384", "387", "396", "402", "406", "408", "410", "421-422", "427", "430", "433", "439-441", "443",
"446", "449", "452", "454-456", "460-463", "465", "467", "470-473", "489", "491-495", "498", "506",
"510", "514-515", "518-520", "523", "527", "533", "535", "539", "541", "543", "546-547", "549",
"552", "554", "556", "558", "561", "566-567", "569", "574", "576-577", "583-584", "589-592", "595596", "602", "608", "610-612", "614", "618", "626-628", "634", "639", "644-646", "648-651", "653655", "659", "672-673", "677", "679", "685", "688", "690-693", "697-700", "704", "709", "717",
"721", "723", "734", "737", "740", "744", "751-752", "755", "763-764", "773", "778", "781", "784785", "794-795", "797-798", "802-805", "808", "810-811", "813-814", "817-823", "825-826", "834",
"840", "851", "856", "880-881", "888-889", "891", "893", "900", "902-905", "910-911", "913", "931932", "941", "947-949", "952", "957", "959-960", "962-963", "965-966", "969", "973", "975", "985",
"1004-1005", "1009-1011", "1015", "1018", "1020", "1023", "1026", "1029", "1031", "1033-1034",
"1036", "1039", "1042", "1054", "1057-1058", "1062", "1080-1081", "1085", "1091", "1094", "1101",
"1103", "1107", "1123", "1127", "1145", "1150", "1157", "1159-1166", "1172-1173", "1175", "1177",
"1182", "1185", "1187", "1189", "1196", "1198-1200", "1202", "1211-1214", "1216-1218", "12201221", "1224-1225", "1227", "1229", "1257-1259", "1261", "1263", "1266", "1286", "1288", "1294",
"1296", "1339", "1351", "1361", "1377", "1379", "1382", "1394", "1410", "1418", "1433-1434",
"1445", "1449", "1459", "1462", "1465", "1470", "1477-1478", "1504", "1507", "1568-1569", "16341635", "1637", "1658", "1660", "1662-1663", "1668", "1675", "1678", "1680", "1691-1692", "16961698", "1712", "1721-1723", "1729", "1731", "1744-1745", "1751", "1757", "1772", "1779", "17811782", "1799", "1805", "1810", "1813", "1818-1819", "1821", "1838", "1923-1924", "1939"]
[]
0
18.53%
0.00%
0.00%
0.00%
"1209-1210", "1275-1285",
"1503-1513", "1520-1521",
"1595", "1597-1598", "1601",
"1603-1604", "1607", "1610",
"1612-1615", "1622-1630",
"1636", "1638-1640", "1642",
"2059", "2063", "2066-2067"]
["0-2", "7-9", "18-25", "452458", "572-589", "668", "717728", "984", "986-988", "1146",
"1218-1227", "1557-1569",
"1660-1668", "1937-1939",
"1941-1943"]
["0-12", "14-36", "375-379",
"853-873", "966", "1211-1212",
"1217", "1221", "1223-1225",
"1232", "1242", "1255-1257",
"1260-1262", "1318-1326",
"1685-1686", "1692-1693",
"1819-1824"]
["1-20", "22-23", "25", "57-67",
"106-112", "470-474", "476479", "481-482", "491-504",
"519-520", "522-523", "13721375", "1579-1586", "16141625", "1648-1656", "17461747", "2117-2119", "21232124"]
95
4.89%
[]
0
0.00%
97
5.05%
[]
0
0.00%
110
5.17%
["8",
"471472",
"474",
"476",
"478",
"481482",
"500",
"502504"]
12
0.56%
["1-3", "24", "75", "268-269",
"427-434", "438-444", "550554", "691-692", "1006-1009",
"1085-1089", "1119-1125",
"1127-1130", "1141", "12781304", "1364-1366", "15751586", "1591-1599", "19781984", "1993-1999", "2051",
"2053-2057"]
["1-32", "476-479", "513-518",
"1360-1362", "1368", "13721378", "1578-1589", "1598",
"1615-1624", "1647-1651",
"1737", "1902-1905", "2135",
"2139"]
121
5.88%
[]
0
0.00%
88
4.11%
["8", "3031",
"476",
"478",
"516",
"1376",
"1737"]
8
0.37%
["0-15", "18-20", "22-37", "4041", "58", "64", "312-316",
"384-393", "593-595", "733738", "861-867", "976", "978990", "1148-1149", "11511155", "1157-1159", "12181243", "1325-1326", "1329",
"1853", "1953-1966"]
["1-4", "6", "137-148", "150",
"435", "484", "583-589", "591597", "611-613", "619", "734737", "854-855", "1148-1163",
"1223-1238", "1309-1321",
"1324-1336", "1339-1340",
"1527-1532", "1536-1537",
"1540-1543", "1605-1622",
"1689-1690", "1693-1701",
"1814-1817", "1935", "20702077"]
138
7.02%
[]
0
0.00%
158
7.60%
["583584",
"589",
"591592",
"595596",
"611612",
"734",
"737",
"1150",
"1157",
"11591163",
"12241225",
"1227",
"1229",
"1339",
"16961698"]
26
1.25%
["1-2", "4", "13-14", "37-49",
145
7.31%
[]
0
0.00%
SVCV
SYNV
["11", "33-34", "37-39", "41", "43", "45", "47", "90", "129", "165", "215", "221-222", "226", "229- 387
230", "232-233", "256", "259", "268", "270-272", "275-277", "279", "297", "301-302", "306", "309",
"312", "323", "334", "340-341", "343-345", "347-348", "351-353", "356", "359", "363-364", "368",
"371", "374", "383", "389", "393", "395", "397", "408-409", "414", "417", "420", "426-428", "430",
"433", "436", "439", "441-443", "447-450", "452", "454", "457-460", "476", "478-482", "485", "493",
"497", "501-502", "505-507", "510", "514", "520", "522", "526", "528", "530", "533-534", "536",
"539", "541", "543", "545", "548", "553-554", "556", "561", "563-564", "570-571", "576-579", "582583", "589", "595", "597", "599", "601", "605", "613-615", "621", "626", "631-633", "635-638",
"640-642", "646", "659-660", "664", "666", "672", "675", "677-680", "684-687", "691", "696", "704",
"708", "710", "721", "724", "727", "731", "738-739", "742", "750-751", "760", "765", "768", "771772", "781-782", "784-785", "789-792", "795", "797-798", "800-801", "804-810", "812-813", "821",
"827", "838", "843", "869-870", "877-878", "880", "882", "889", "891-894", "899-900", "902", "920921", "930", "936-938", "941", "946", "948-949", "951-952", "954-955", "958", "962", "964", "974",
"993-994", "996", "998-1000", "1004", "1007", "1009", "1012", "1015", "1018", "1020", "10221023", "1025", "1028", "1031", "1043", "1046-1047", "1051", "1069-1070", "1074", "1080", "1083",
"1090", "1092", "1096", "1118", "1122", "1140", "1145", "1152", "1154-1161", "1167-1168", "1170",
"1172", "1177", "1180", "1182", "1184", "1191", "1193-1195", "1197", "1206-1209", "1211-1213",
"1215-1216", "1219-1220", "1222", "1224", "1252-1254", "1256", "1258", "1261", "1281", "1283",
"1289", "1291", "1315", "1334", "1346", "1356", "1373", "1375", "1378", "1390", "1406", "1408",
"1413", "1428-1429", "1440", "1444", "1454", "1457", "1460", "1465", "1472-1473", "1498", "1501",
"1565-1566", "1631-1632", "1634", "1655", "1657", "1659-1660", "1665", "1672", "1675", "1677",
"1688-1689", "1693-1695", "1709", "1718-1720", "1726", "1728", "1741-1742", "1748", "1754",
"1770", "1777", "1779-1780", "1797", "1803", "1808", "1811", "1816-1817", "1819", "1835", "19241925", "1942"]
[]
0
TVCV
[]
0
VHSV
[]
0
VSIV
VSNJV
VSSJV
["11", "33-34", "37-39", "41", "43", "45", "47", "91", "131", "171", "228", "234-235", "239", "242", 384
"245", "267", "270", "279", "281-283", "286-288", "290", "308", "312-313", "317", "320", "323",
"334", "345", "351-352", "354-356", "358-359", "362-364", "367", "370", "374-375", "379", "382",
"385", "394", "400", "404", "406", "408", "419-420", "425", "428", "431", "437-439", "441", "444",
"447", "450", "452-454", "458-459", "461", "463", "465", "468-471", "487", "489-493", "496", "499",
"504", "508", "512-513", "516-518", "521", "525", "531", "533", "537", "539", "541", "544-545",
"547", "550", "552", "554", "556", "559", "564-565", "567", "572", "574-575", "581-582", "587-590",
"593-594", "600-601", "606", "608-610", "612", "616-617", "624-626", "632", "637", "642-644",
"646-649", "651-653", "657", "670-671", "675", "677", "683", "686", "688-691", "695-698", "702",
"707", "715", "717", "721", "732", "735", "738", "742", "746", "749-750", "753", "761-762", "770771", "776", "779", "782-783", "792-793", "795-796", "800-803", "806", "808-809", "811-812", "815821", "823-824", "832", "838", "849", "854", "880-881", "888-889", "891", "893", "900", "902-905",
"910-911", "913", "931-932", "941", "947-949", "952", "957", "959-960", "962-963", "965-966",
"969", "973", "975", "985", "1004-1005", "1009-1011", "1015", "1018", "1020", "1023", "10261027", "1029", "1031-1034", "1036", "1039", "1042", "1057-1058", "1062", "1081-1082", "1086",
"1092", "1095", "1102", "1104", "1108", "1130", "1134", "1150", "1152", "1157", "1164", "11661173", "1182", "1184", "1189", "1192", "1194", "1196", "1203", "1206-1207", "1209", "1218-1221",
"1223-1225", "1227-1229", "1231-1232", "1234", "1236", "1264-1266", "1268", "1270", "1273",
"1293", "1295", "1301", "1303", "1346", "1358", "1368", "1385", "1387", "1390", "1402", "1418",
"1441", "1453", "1457", "1467", "1473", "1478", "1485-1486", "1511", "1514", "1579-1580", "16461647", "1649", "1670", "1672", "1674-1675", "1680", "1687", "1690", "1692", "1703-1704", "17081710", "1724", "1733-1735", "1741", "1743", "1756-1757", "1763", "1769", "1785", "1792", "17941795", "1800", "1812", "1818", "1823", "1826", "1831-1832", "1834", "1850", "1938", "1956"]
["8", "33-34", "37-39", "41", "43", "45", "47", "91", "131", "171", "228", "234-235", "239", "242",
380
"245", "267", "270", "279", "281-283", "286-288", "290", "308", "312-313", "317", "320", "323",
"334", "345", "351-352", "354-356", "358-359", "362-364", "367", "370", "374-375", "379", "382",
"385", "394", "400", "404", "406", "408", "419-420", "425", "428", "431", "437-439", "441", "444",
"447", "450", "452-454", "458-459", "461", "463", "465", "468-471", "487", "489-493", "496", "499",
"504", "508", "512-513", "516-518", "521", "525", "531", "533", "537", "539", "541", "544-545",
"547", "550", "552", "554", "556", "559", "564-565", "567", "572", "574-575", "581-582", "587-590",
"593-594", "600-601", "606", "608", "610", "612", "616-617", "624-626", "632", "637", "642-644",
"646-649", "651-653", "657", "670-671", "675", "677", "683", "686", "688-691", "695-698", "702",
"707", "715", "717", "721", "732", "735", "738", "742", "746", "749-750", "753", "761-762", "771",
"776", "779", "782-783", "792-793", "795-796", "800-803", "806", "808-809", "811-812", "815-821",
"823-824", "832", "838", "849", "854", "880-881", "888", "893", "900", "902-905", "910-911", "913",
"931-932", "941", "947-949", "952", "957", "959-960", "962-963", "965-966", "969", "973", "975",
"985", "1004-1005", "1009-1011", "1015", "1018", "1020", "1023", "1026-1027", "1029", "10311034", "1036", "1039", "1042", "1057-1058", "1062", "1081-1082", "1086", "1092", "1095", "1102",
"1104", "1108", "1130", "1134", "1150", "1152", "1157", "1164", "1166-1173", "1182", "1184",
"1189", "1192", "1194", "1196", "1203", "1206-1207", "1209", "1218-1221", "1223-1225", "12271229", "1231-1232", "1234", "1236", "1264-1266", "1268", "1270", "1273", "1293", "1295", "1301",
"1303", "1346", "1358", "1368", "1385", "1387", "1390", "1402", "1418", "1441", "1453", "1457",
"1467", "1473", "1478", "1485-1486", "1511", "1514", "1579-1580", "1646-1647", "1649", "1670",
"1672", "1674-1675", "1680", "1687", "1690", "1692", "1703-1704", "1708-1710", "1724", "17331735", "1741", "1743", "1756-1757", "1763", "1769", "1785", "1792", "1794-1795", "1800", "1812",
"1818", "1823", "1826", "1831-1832", "1834", "1850", "1938", "1956"]
["11", "33-34", "37-39", "41", "43", "45", "47", "91", "131", "171", "228", "234-235", "239", "242", 386
"245", "267", "270", "279", "281-283", "286-288", "290", "308", "312-313", "317", "320", "323",
"334", "345", "351-352", "354-356", "358-359", "362-364", "367", "370", "374-375", "379", "382",
"385", "394", "400", "404", "406", "408", "419-420", "425", "428", "431", "437-439", "441", "444",
"447", "450", "452-454", "458-459", "461", "463", "465", "468-471", "487", "489-493", "496", "499",
"504", "508", "512-513", "516-518", "521", "525", "531", "533", "537", "539", "541", "544-545",
"217-221", "338", "340-346",
"349", "440-452", "551-557",
"612-618", "701-706", "708",
"845-853", "855", "857", "914915", "997-1004", "1040-1043",
"1073-1075", "1083-1084",
"1217-1234", "1304-1312",
"1495-1500", "1882-1892",
"1978-1982"]
18.47% ["0-20", "307", "309-310", "312315", "416", "475-476", "10571064", "1148-1156", "12201233", "1237", "1269-1277",
"1307-1308", "1328", "13711372", "1436-1438", "14401441", "1443", "1449-1466",
"1523-1531", "1534-1551",
"1624-1627", "1686-1698",
"1984-1986", "1988-1995",
"2052", "2072-2094"]
180
8.59%
["11",
"309",
"312",
"476",
"1152",
"11541156",
"1220",
"1222",
"1224",
"1440",
"1454",
"1457",
"1460",
"1465",
"16881689",
"16931695"]
21
1.00%
["1-8", "16-27", "29-30", "388394", "518-519", "703-719",
"759-768", "772-773", "906913", "1012-1016", "11721173", "1176-1182", "16351662", "1664-1669", "17551762", "2109", "2111-2115"]
0.00%
["0-38", "310", "952", "955961", "963-969", "1078-1086",
"1215-1216", "1218-1221",
"1226-1230", "1233", "12491258", "1261-1268", "13191324", "1671", "1821-1827",
"1917-1927"]
0.00%
["1-8", "12-18", "339-349",
"449", "548-556", "842", "850863", "1134", "1145", "12151236", "1321-1322", "13541366", "1368", "1714-1716",
"1749", "1756-1758", "1819",
"1822-1830", "1977", "19801983"]
18.21% ["0-2", "4-18", "190-201", "314319", "426", "484", "487-488",
"491", "1108-1109", "11591173", "1222", "1231-1240",
"1458-1460", "1462-1475",
"1539-1541", "1591-1600",
"1602-1605", "1701", "17031705", "1707-1716", "17671774", "1951", "1953-1956",
"2008-2009", "2012-2030",
"2102-2108"]
130
6.14%
[]
0
0.00%
119
6.17%
[]
0
0.00%
113
5.70%
[]
0
0.00%
158
7.49%
["11",
"317",
"487",
"491",
"1108",
"1164",
"11661173",
"12311232",
"1234",
"1236",
"1467",
"1473",
"17031704",
"17081710",
"1769",
"1956"]
27
1.28%
18.02%
["0-4", "7-15", "413", "483",
"488", "585-588", "1104-1115",
"1117", "1159-1172", "12171218", "1222", "1231-1245",
"1317-1320", "1380-1384",
"1463-1473", "1531", "1556",
"1628-1633", "1700-1716",
"1733", "1944-1960", "20202023", "2106-2108"]
136
6.45%
["8",
"587588",
"1104",
"1108",
"1164",
"11661172",
"1218",
"12311232",
"1234",
"1236",
"1467",
"1473",
"17031704",
"17081710",
"1733",
"1956"]
27
1.28%
18.30% ["0-2", "4-18", "190-201", "314319", "426", "475-476", "488",
"491", "1108-1109", "11591173", "1222", "1231-1240",
"1458-1460", "1462-1475",
"1539-1541", "1589-1594",
152
7.21%
["11",
"317",
"491",
"1108",
"1164",
"1166-
26
1.23%
0.00%
"547", "550", "552", "554", "556", "559", "564-565", "567", "572", "574-575", "581-582", "587-590",
"593-594", "600-601", "606", "608-610", "612", "616-617", "624-626", "632", "637", "642-644",
"646-649", "651-653", "657", "670-671", "675", "677", "683", "686", "688-691", "695-698", "702",
"707", "715", "717", "721", "732", "735", "738", "742", "746", "749-750", "753", "761-762", "770771", "776", "779", "782-783", "792-793", "795-796", "800-803", "806", "808-809", "811-812", "815821", "823-824", "832", "838", "849", "854", "880-881", "888", "893", "900", "902-905", "910-911",
"913", "931-932", "941", "947-949", "952", "957", "959-960", "962-963", "965-966", "969", "973",
"975", "985", "1004-1005", "1007", "1009-1011", "1015", "1018", "1020", "1023", "1026-1027",
"1029", "1031-1034", "1036", "1039", "1042", "1057-1058", "1062", "1081-1082", "1086", "1092",
"1095", "1102", "1104", "1108", "1130", "1134", "1150", "1152", "1157", "1164", "1166-1173",
"1179", "1182", "1184", "1189", "1192", "1196", "1203", "1206-1207", "1209", "1218-1221", "12231225", "1227-1229", "1231-1232", "1234", "1236", "1264-1266", "1268", "1270", "1273", "1293",
"1295", "1301", "1303", "1327", "1346", "1358", "1368", "1385", "1387", "1390", "1402", "1418",
"1426", "1441-1442", "1453", "1457", "1467", "1473", "1478", "1485-1486", "1511", "1514", "15791580", "1646-1647", "1649", "1670", "1672", "1674-1675", "1680", "1687", "1690", "1692", "17031704", "1708-1710", "1724", "1733-1735", "1741", "1743", "1756-1757", "1763", "1769", "1785",
"1792", "1794-1795", "1800", "1812", "1818", "1823", "1826", "1831-1832", "1834", "1850", "1938",
"1956"]
"1596-1599", "1603-1605",
"1701", "1703-1705", "17071716", "1767-1774", "19531956", "2008-2024", "21022108"]
1173",
"12311232",
"1234",
"1236",
"1467",
"1473",
"17031704",
"17081710",
"1769",
"1956"]
C. Filoviridae L Polymerase
Name | CICP Position | CICP # | CICP % | Disorder Position | Disorder # | Disorder % | Both Position | Both # | Both %
MARV | [] | 0 | 0.00% | ["1-8", "10-12", "761-762", "848-851", "1458-1471", "1562-1567", "1690-1704", "1706-1707", "1748", "1750-1806", "1808-1809", "1811", "1840", "1842-1863", "2042-2044", "2046-2049", "2322", "2324-2330"] | 154 | 6.61% | [] | 0 | 0.00%
REBOV | [] | 0 | 0.00% | ["1-12", "164-168", "515", "614-619", "683-700", "703-704", "756-766", "1064-1069", "1201-1206", "1433-1449", "1603-1604", "1608", "1610-1611", "1647-1721", "1736-1755", "1770-1771", "1773-1774", "1776-1777", "2205-2211"] | 198 | 8.95% | [] | 0 | 0.00%
SEBOV | [] | 0 | 0.00% | ["1-13", "165-168", "687-699", "757-759", "763", "1202-1205", "1434-1449", "1602-1618", "1649-1752", "1769-1783", "1917-1918", "1929", "2202-2209"] | 202 | 9.14% | [] | 0 | 0.00%
ZEBOV | [] | 0 | 0.00% | ["1-12", "163-169", "339-340", "605", "683-708", "756-766", "1064-1067", "1203-1205", "1435-1450", "1649-1720", "1728-1731", "1733-1752", "1769-1782", "1910-1914", "1932", "2105-2110", "2205-2207", "2209", "2211"] | 210 | 9.49% | [] | 0 | 0.00%
D. Bornaviridae L Polymerase
Name | CICP Position | CICP # | CICP % | Disorder Position | Disorder # | Disorder % | Both Position | Both # | Both %
BDV | [] | 0 | 0.00% | ["1-2", "4-5", "755-756", "758-760", "1018", "1032", "1097", "1102-1108", "1110-1111", "1429-1431", "1445-1448"] | 28 | 1.74% | [] | 0 | 0.00%
APPENDIX D
SUPPLEMENTARY TABLE 3.2
Supplementary Table 3.2 List of predicted Disordered and CICP residues for each virus's
P protein. The numbers in the Disorder Position and CICP Position columns correspond
to the unaligned residue position(s) of each sequence. A.) Paramyxoviridae B.)
Rhabdoviridae C.) Filoviridae D.) Bornaviridae. The table columns are: Name - the
abbreviated name of the virus (see Methods), CICP Position - the location of the CICP
residues corresponding to the sequence position, CICP # - the total number of CICPs for
the sequence, CICP % - the percentage of CICP-positive residues in the sequence,
Disorder Position - the location of the disordered residues corresponding to sequence
position, Disorder # - the total number of disordered amino acids in the sequence,
Disorder % - the percentage of disordered residues in the sequence, Both Position -
the residue positions that are positive for both CICP and disorder in the sequence, Both # - the
total number of residues that are both disordered and a CICP in the sequence, Both % -
the percentage of residues that are both disordered and a CICP in the sequence.
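The relationship between the position lists and the count and percentage columns can be illustrated with a short Python sketch. This is not the code used to build the table; the function names `expand_positions` and `summarize` are hypothetical. The sequence length of 246 used for AVPMV6 is an assumption back-calculated from the reported Disorder # and Disorder %, since the table itself does not list sequence lengths.

```python
def expand_positions(ranges):
    """Expand strings such as "0-31" or "56" into a set of integer positions."""
    positions = set()
    for item in ranges:
        start, _, end = item.partition("-")
        first = int(start)
        last = int(end) if end else first  # a single position has no "-" part
        positions.update(range(first, last + 1))
    return positions


def summarize(cicp_ranges, disorder_ranges, seq_length):
    """Return (count, percent) pairs for the CICP, Disorder and Both columns."""
    cicp = expand_positions(cicp_ranges)
    disorder = expand_positions(disorder_ranges)
    both = cicp & disorder  # residues flagged by both predictions

    def stats(residues):
        return len(residues), round(100.0 * len(residues) / seq_length, 2)

    return {"CICP": stats(cicp), "Disorder": stats(disorder), "Both": stats(both)}


# Disorder Position list reported for AVPMV6 in part A of this table.
avpmv6_disorder = ["0-31", "52-54", "56", "178", "183-187", "189-191", "241", "245"]

# seq_length=246 is an assumed value (not given in the table).
result = summarize([], avpmv6_disorder, seq_length=246)
print(result["Disorder"])  # (47, 19.11)
```

Under that assumed length, the sketch reproduces the Disorder # of 47 and Disorder % of 19.11% listed for AVPMV6. An empty CICP list gives 0 and 0.00% for both the CICP and Both columns.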
A. Paramyxoviridae Phosphoprotein
Name | CICP Position | CICP # | CICP % | Disorder Position | Disorder #
AVPMV6 | [] | 0 | 0.00% | ["0-31", "52-54", "56", "178", "183-187", "189-191", "241", "245"] | 47
AVPNV | [] | 0 | 0.00% | ["1-8", "15-17", "30-31", "35-38", "40", "47-136", "138-167", "169", "197-203", "228-293"] | 212
BEIV | ["8", "19", "21", "35", "39", "46", "65", "73", "76-78", "84", "91-93", "98", "101", "103", "105-107", "109", "116-117", "121-122", "124", "128", "134", "139", "141-145", "154-155", "159", "182-185", "189", "194", "205", "230-231", "235", "241-242", "246"] | 51 | 19.54% | ["0-18", "20-60", "127-129", "131-140", "149-151", "153", "155-167", "245", "247-260"] | 105
[]
BPIV
[]
0
BRSV
[]
0
CDV
["8", "18", "21", "96", "105", "115-119", "123", "125", "127",
"129", "132-133", "137", "140", "142-144", "146-148", "157",
"161-163", "166", "176-177", "204", "215-217", "222", "261",
"264"]
38
DMV
33
12.00%
FDLV
["8", "19", "21", "35", "96", "98", "104", "110", "113", "118-119",
"123", "126", "128-129", "136-137", "142-143", "162-168",
"177", "211", "215", "220", "253", "261", "267"]
[]
0
0.00%
GPV
[]
0
0.00%
HEND
["21", "25", "43", "114-115", "128", "139-140", "142-144", "150",
"155", "157", "161", "168-169", "171", "174", "178", "180", "188190", "192", "194-195", "231-232", "235-236", "243-245", "250",
"256-257", "284", "286", "289", "294-295"]
42
13.73%
HMPV
[]
0
0.00%
HPIV1
HPIV2
HPIV3
[]
[]
[]
0
0
0
0.00%
0.00%
0.00%
HRSVA2
HRSVB1
HRSVS2
[]
[]
[]
0
0
0
0.00%
0.00%
0.00%
JV
["17", "19", "21", "25", "31", "35", "46", "63", "65-66", "73",
"76", "85", "90-93", "98-103", "105", "108-109", "115-117",
"119", "121-122", "128", "142-144", "155", "180-181", "183",
"185", "189-190", "194", "196", "202-203", "205", "235", "240241", "244"]
["60", "82", "84", "89", "93", "119", "148", "152", "212-213",
"218"]
["21", "35", "84-86", "92-93", "96", "104", "110-111", "113",
"118-119", "121-123", "125", "129", "135-139", "142", "160",
"162", "164-166", "176-177", "201", "207", "209", "215", "217",
"220", "222", "228-229", "231", "256", "261", "266-268"]
["19", "39", "103", "114", "121", "123", "129", "134", "136-137",
"141", "147", "154", "157", "160", "164", "171", "175-178",
"181", "190-191", "237", "248-249", "258", "273-274", "277",
"281-282", "284", "287", "292"]
52
19.70%
["21", "49-50", "67", "69", "71-74", "76", "83-84", "86-89",
"101", "110", "115", "122", "124", "127-128", "148-149", "151153", "173", "180", "184-185", "212-213", "216", "218"]
MENV
MeV
MOSV
MuV
Disorder Both Position Both Both %
%
#
19.11%
[]
0
0.00%
72.11%
[]
0
0.00%
40.23%
10
3.83%
88
35.34%
["8", "21",
"35", "39",
"46", "128",
"134", "139",
"155", "159"]
[]
0
0.00%
168
69.71%
[]
0
0.00%
156
56.52%
14
5.07%
["0-81", "144", "146", "148-162", "180-208", "218-230",
"232-238", "240-247", "274"]
157
57.09%
6
2.18%
["0-14", "37-48", "57", "141-163", "167-175", "178-219",
"233-238", "261-264"]
["0-42", "76", "79-85", "125-126", "128-196", "213", "215218"]
["0-73", "178-193", "245-247", "249", "252-254", "288",
"292-293", "296", "299-305"]
112
42.26%
["8", "18",
"21", "146",
"157", "161162", "176177", "204",
"215-217",
"222"]
["8", "19",
"21", "35",
"162", "220"]
[]
0
0.00%
127
57.99%
[]
0
0.00%
108
35.29%
10
3.27%
["1-7", "26-75", "81", "89-170", "172-173", "197", "199204", "228-249", "254-293"]
["0-7", "84", "86-95", "113-198", "211-212", "246-250"]
["0-44", "115", "151-171", "211-212", "215-217", "220"]
["0-5", "23-25", "116-134", "136-140", "146-164", "166172", "186-195", "212-216", "242-247", "249"]
["0-16", "25-40", "46-135", "156-163", "188-240"]
["0-16", "25-40", "46-135", "156-163", "188-240"]
["0-16", "25-40", "46-135", "156-163", "184-185", "187240"]
["0-20", "23", "31", "40-51", "53-63", "123-144", "147-150",
"152", "154-174", "176-179", "192-193", "248-263"]
211
71.77%
["21", "25",
"43", "178",
"180", "188190", "192",
"245"]
[]
0
0.00%
112
73
81
44.62%
33.03%
32.40%
[]
[]
[]
0
0
0
0.00%
0.00%
0.00%
184
184
187
76.35%
76.35%
77.59%
[]
[]
[]
0
0
0
0.00%
0.00%
0.00%
116
43.94%
10
3.79%
0.00% ["0-5", "89", "94", "114-139", "143-152", "157", "160", "162194", "239-242", "244-248"]
0.00% ["0-16", "18", "26-43", "45-74", "87-98", "101", "104-106",
"108-125", "127-134", "157-161", "184", "187-240"]
13.77% ["0-80", "145-146", "149-162", "176-227", "240-244", "274275"]
11
4.95% ["0-38", "130-133", "135-136", "151-170", "199", "214-221"]
74
["17", "19",
"31", "46",
"63", "128",
"142-144",
"155"]
33.33% ["152", "218"]
2
0.90%
47
17.03% ["0-2", "4-5", "18-43", "52-78", "143-144", "146", "149-162",
"184-200", "222-228", "232", "238-240"]
103
37.32%
["21", "35",
"160", "162",
"222", "228"]
6
2.17%
36
12.04% ["0-42", "65-70", "74", "84", "86", "89-93", "95-98", "160",
"162-216", "218-220", "225-245", "263-275", "294-298"]
159
53.18%
15
5.02%
36
16.29%
99
44.80%
["19", "39",
"160", "164",
"171", "175178", "181",
"190-191",
"237", "273274"]
["21", "110",
"115", "148149", "151-
9
4.07%
["0-41", "107-118", "132-142", "144-173", "206", "211",
"217", "220"]
153
NDV
[]
0
72
32.88%
45
0.00% ["0-40", "150-152", "154", "156-163", "165-167", "182-185",
"188-193", "195-196", "213", "215", "217-218"]
14.71%
["0-75", "178-194", "246-256", "296", "299-305"]
NIPH
["25", "29", "114-115", "128", "132", "142-143", "149", "151152", "155", "157", "161", "163", "166-167", "169", "174", "178",
"182", "184", "188-194", "199", "230", "232", "236", "238",
"243", "245", "250", "256-257", "262", "266", "281", "289", "294295"]
["21", "35", "39", "84-86", "90", "93", "96", "104-105", "110111", "118-119", "122-123", "125", "129", "135-137", "142",
"144", "156", "160-163", "165-167", "176-177", "204", "207208", "217", "220", "228-229", "231", "256", "261", "266-267",
"272"]
112
36.60%
47
16.91%
["0-9", "19-35", "39", "48-50", "52-76", "152-153", "156162", "164-166", "180-207", "209-211", "216", "218-228",
"240-243", "274-277"]
119
42.81%
PDV
["19", "34-35", "84", "96", "100", "105", "118", "123", "129",
"136-137", "139-140", "142-143", "146", "160-165", "177",
"204", "208", "210", "215", "217", "220", "253", "267", "269"]
33
11.96% ["0-78", "80", "149-162", "177-192", "206-207", "216-226",
"243-245", "275"]
127
46.01%
PNVM15
[]
0
0.00%
[]
0
0.00%
176
["19", "39", "96", "105", "118", "120", "123", "129", "136-137",
"139", "142-143", "156", "160", "162-164", "199", "202", "204",
"208", "215", "217", "220", "222", "253", "267", "269"]
29
["0-17", "21", "23-47", "53-132", "155-180", "205-208",
"247", "249", "251-252", "255-256", "279-289", "291-294"]
["0-17", "21", "23-47", "52-132", "155-180", "205-208",
"247", "249", "251-252", "255-256", "279-289", "291-294"]
10.51% ["0-78", "143-146", "149-162", "176-197", "199-203", "208227", "240-241", "275"]
175
PNVMJ366
6
RPV
[]
[]
0
0
0.00%
0.00%
PDPR
RSV
SENV
SPIV41
[]
0
SPIV5
["19", "23", "37", "44", "47", "50", "60", "67", "69", "73", "81",
"83", "88", "93-94", "101", "103", "110", "119", "128", "148149", "152-153", "164", "167", "173", "175", "182", "185", "202",
"212-213", "216-218"]
["20-21", "23", "25", "47", "49-50", "58", "60", "69-70", "72-73",
"81", "84", "86", "88-90", "94", "98", "100-101", "103", "106",
"110-112", "115", "122", "126-129", "147", "149", "151-152",
"173", "180", "182", "184-185", "208", "212-213", "218"]
36
["0-16", "25-40", "46-135", "156-164", "186-240"]
["0-7", "114-167", "169-203", "208-209", "211-215", "246250"]
0.00% ["0-14", "38", "103", "105-117", "123-127", "133-140", "152171", "220"]
16.29%
["0-16", "107-109", "162", "165-173"]
47
21.27%
["19", "31", "35", "39", "111", "122", "124", "131", "144", "149",
"155", "162-163", "168-169", "173", "178", "182", "184-187",
"231-232", "237", "242", "275", "283", "289"]
29
9.73% ["1-5", "24-36", "52-113", "165-166", "168-169", "171-193",
"195-245", "293-297"]
TIOV
TUPV
["0-41", "80", "150-171", "205-211", "213-218", "220"]
153", "173"]
[]
0
0.00%
14
4.58%
13
4.68%
9
3.26%
59.32%
["25", "29",
"178", "182",
"184", "188194", "250",
"256"]
["21", "35",
"39", "156",
"160-162",
"165-166",
"204", "207",
"220", "228"]
["19", "3435", "160162", "177",
"217", "220"]
[]
0
0.00%
59.66%
[]
0
0.00%
147
53.26%
13
4.71%
187
109
77.59%
43.43%
["19", "39",
"143", "156",
"160", "162",
"199", "202",
"208", "215",
"217", "220",
"222"]
[]
[]
0
0
0.00%
0.00%
64
28.96%
[]
0
0.00%
30
13.57% ["167", "173"]
2
0.90%
79
35.75%
9
4.07%
163
54.70%
16
5.37%
["20-21",
"23", "25",
"151-152",
"208", "213",
"218"]
["31", "35",
"111", "168169", "173",
"178", "182",
"184-187",
"231-232",
"237", "242"]
B. Rhabdoviridae Phosphoprotein
Name | CICP Position | CICP # | CICP % | Disorder Position | Disorder # | Disorder % | Both Position | Both # | Both %
ABLV | [] | 0 | 0.00% | ["0-7", "38-45", "47-51", "54-89", "131-194", "196", "294-296"] | 125 | 42.09% | [] | 0 | 0.00%
BEFV | [] | 0 | 0.00% | ["0-10", "19-20", "22-56", "120", "122-136", "171-197", "274", "276"] | 93 | 33.45% | [] | 0 | 0.00%
CHPV | [] | 0 | 0.00% | ["0-14", "16-86", "113-117", "167-210", "289-292"] | 139 | 47.44% | [] | 0 | 0.00%
FLAV | [] | 0 | 0.00% | ["0-6", "14-17", "21", "24-47", "49", "52-78", "122-156", "217-230"] | 113 | 48.92% | [] | 0 | 0.00%
HIRV | [] | 0 | 0.00% | ["1-12", "22", "25-64", "103-105", "115-122", "144-155", "186", "189-197", "206-207", "222-226"] | 93 | 40.97% | [] | 0 | 0.00%
IHNV | [] | 0 | 0.00% | ["0-12", "20-63", "65-66", "103-108", "115-122", "144-157", "189-194", "204-210", "224-226", "228-229"] | 105 | 45.65% | [] | 0 | 0.00%
ISFV | [] | 0 | 0.00% | ["1-2", "4-6", "9-16", "18-19", "23-38", "58-73", "75", "164-214", "226-231", "286-288"] | 108 | 37.37% | [] | 0 | 0.00%
LNYV | [] | 0 | 0.00% | ["0-59", "137-140", "182-187", "189", "193-194", "196-231", "233", "296-297", "299"] | 113 | 37.67% | [] | 0 | 0.00%
MFSV | [] | 0 | 0.00% | ["0-23", "28-42", "46", "49", "51", "58-59", "64-82", "84", "196-245", "286", "297", "299-300", "329-337"] | 127 | 37.57% | [] | 0 | 0.00%
MMV | [] | 0 | 0.00% | ["0-36", "38-66", "68-71", "160-164", "166-168", "170-171", "263-268"] | 86 | 31.97% | [] | 0 | 0.00%
MOKV | [] | 0 | 0.00% | ["0-7", "9", "38-89", "92", "133", "135-199", "213-214", "298", "300-302"] | 134 | 44.22% | [] | 0 | 0.00%
NCMV | [] | 0 | 0.00% | ["0-13", "37-53", "185-190", "276-285"] | 47 | 16.43% | [] | 0 | 0.00%
RABV | [] | 0 | 0.00% | ["1-7", "37-89", "132-194", "196", "212-214", "248", "290-291", "293-296"] | 134 | 45.12% | [] | 0 | 0.00%
RYSV | [] | 0 | 0.00% | ["0-42", "45", "48-50", "52-53", "57", "59-125", "215-224", "257-266", "269", "289", "292-302", "315-321"] | 157 | 48.76% | [] | 0 | 0.00%
SCRV | [] | 0 | 0.00% | ["1-5", "7", "29-86", "88-90", "92-98", "153-197", "199", "280-281"] | 122 | 43.26% | [] | 0 | 0.00%
SNAK | [] | 0 | 0.00% | ["0-9", "15-19", "22-26", "31-35", "43-44", "46-52", "91-119", "138-148", "177-181", "217-219"] | 82 | 37.27% | [] | 0 | 0.00%
SVCV | [] | 0 | 0.00% | ["0-6", "8", "14-24", "26-108", "110-114", "185-236", "307-308"] | 161 | 52.10% | [] | 0 | 0.00%
SYNV | [] | 0 | 0.00% | ["1-2", "171-179", "182-239", "243-244", "253-285"] | 104 | 36.36% | [] | 0 | 0.00%
TVCV | [] | 0 | 0.00% | ["0-42", "44-45", "163", "268", "270"] | 48 | 17.71% | [] | 0 | 0.00%
VHSV | [] | 0 | 0.00% | ["1-3", "22-59", "61", "91", "94-101", "107-120", "135-151", "199-205", "209", "216-221"] | 96 | 43.24% | [] | 0 | 0.00%
VSIV | [] | 0 | 0.00% | ["1-2", "4", "21", "34", "36-95", "97-98", "106-134", "180-181", "183-184", "208-214", "262-263"] | 109 | 41.13% | [] | 0 | 0.00%
VSNJ | [] | 0 | 0.00% | ["1-2", "4-5", "14-85", "89", "92", "94", "102-104", "106", "112-120", "170-197"] | 120 | 43.80% | [] | 0 | 0.00%
VSSJ | [] | 0 | 0.00% | ["1-2", "4", "21", "34", "36-95", "97-98", "106-134", "180-181", "183-184", "208-214", "262-263"] | 109 | 41.13% | [] | 0 | 0.00%
C. Filoviridae Phosphoprotein
Name | CICP Position | CICP # | CICP % | Disorder Position | Disorder # | Disorder % | Both Position | Both # | Both %
MARV | [] | 0 | 0.00% | ["1-3", "25-46", "162", "178-182", "291-292", "294-309", "323-328"] | 55 | 16.72% | [] | 0 | 0.00%
REBO | [] | 0 | 0.00% | ["1-9", "40-66", "152", "156-168", "175-176", "185", "193-198", "287-289", "294-310", "322-328"] | 86 | 26.14% | [] | 0 | 0.00%
SEBO | [] | 0 | 0.00% | ["0-14", "42-69", "104", "106", "152-159", "161-168", "175-176", "285-289", "294-302", "305-307", "323-328"] | 86 | 26.14% | [] | 0 | 0.00%
ZEBO | [] | 0 | 0.00% | ["0-27", "55-79", "162-183", "185-215", "306-313", "315-319", "332-339"] | 127 | 37.35% | [] | 0 | 0.00%
D. Bornaviridae Phosphoprotein
Name | CICP Position | CICP # | CICP % | Disorder Position | Disorder # | Disorder % | Both Position | Both # | Both %
BDV | [] | 0 | 0 | ["0-74", "124-125", "170-200"] | 108 | 53.73% | [] | 0 | 0