Final Paper-Lucas Rizkalla

Lucas Rizkalla
Introduction:
Bacteriophages are increasingly being studied as an alternative to antibiotics for treating
antibiotic-resistant bacteria because of their unique properties and therapeutic potential.
Phages could be used to treat bacterial infections for which all other methods have been
exhausted (Svoboda 2009 Popular Science). Phages are difficult to predict because of their
capacity to mutate, but they are nonetheless gaining attention as a potential replacement for
antibiotics as the number of antibiotic-resistant bacteria increases. Phages are not harmful
to human cells, but they can be harmful if mutations arise that produce proteins detrimental
to the immune system. For example, lysogenic phages are known to encode toxin genes,
including those for the toxins that cause cholera, hemorrhagic diarrhea, botulism,
diphtheria, and scarlet fever (Nicol 2003 Microbial Genetics). Phages can also be used to
learn more about the bacteria themselves because phages change in response to changes in
their bacterial hosts. Thus, studying phages can reveal changes in the bacteria (Svoboda
2009 Popular Science). This knowledge provides a useful basis for developing phage therapy
in the future.
Bacillus species are common bacteria in the world around us; they are present in the food
we eat, and some cause human disease. B. anthracis, B. cereus, and B. thuringiensis are of
particular interest because they have very similar genome sequences but different pathogenic
profiles. B. anthracis causes the lethal disease anthrax, B. cereus is a food contaminant
and human pathogen, and B. thuringiensis is an insect pathogen used in bioinsecticides.
These differences are accounted for by various mechanisms of horizontal gene transfer.
Members of the B. cereus group are associated with a variety of bacteriophages, making it
possible to study a diverse range of phages that can provide new information about members
of this group and move research toward phage therapy (Gillis 2014 Viruses). Observing how
phages with different properties interact with these bacteria can show how phage behavior
varies across a range of hosts.
Phages kill bacteria through lysis, which is accomplished by the endolysin protein. Bacillus
bacteria are gram-positive, which makes them good candidates for studying endolysin
treatment. Endolysin might be an efficient protein-based antibiotic because gram-positive
bacteria lack the outer membrane of gram-negative bacteria and can be directly targeted by
endolysin for destruction (Svoboda 2009 Popular Science). Endolysin can potentially be used
on its own, without the whole phage, as an antibacterial treatment applied directly to
gram-positive bacterial infections. In a study reported by Nelson et al., an endolysin
isolated from the B. anthracis gamma phage rescued 13 of 19 mice in an intraperitoneal model
of infection. The enzyme remained fully active after being heated to 60 °C for an hour. In
addition, a second Bacillus phage endolysin, PlyPH, showed relatively high lytic activity
over a broad pH range (Nelson 2012 Advances in Virus Research). These properties make
Bacillus endolysins useful for therapeutic treatment. Analyzing the conserved domains of
endolysins from different phages will give insight into which endolysin properties are
necessary for lysis.
This paper aims to identify conserved domains in Bacillus phage endolysins. The
collected sequences of phage endolysins can be put in a multiple sequence alignment to
observe phylogenetic characteristics of the phages. Furthermore, the phylogenetic tree
can give insight into how conserved domains of endolysin are related. This paper’s focus
is to analyze trends of conserved domains in the phylogenetic tree to understand how
endolysin can be specified in phage therapeutic treatment.
Figure 1. Various lytic sites of endolysin protein. Labeled sites: 1, N-acetylmuramidase;
2, lytic transglycosylase; 3, N-acetylmuramoyl-L-alanine amidase; 7, D-alanyl-D-alanine
carboxypeptidase. The site for N-acetyl-anhydromuramyl-L-alanine amidase could not be found.
Table 3. Endolysin Functional Domains
Phage | Catalytic Domain | Cell Wall Targeting Domain
Troll | N-acetylmuramoyl-L-alanine amidase; D-alanyl-D-alanine carboxypeptidase | SH3
Jugalone | N-acetylmuramoyl-L-alanine amidase; D-alanyl-D-alanine carboxypeptidase | SH3
Phrodo | N-acetylmuramoyl-L-alanine amidase; D-alanyl-D-alanine carboxypeptidase | SH3
Waukesha | N-acetylmuramoyl-L-alanine amidase; PGRP | SH3; SH3
B4 | N-acetylmuramoyl-L-alanine amidase; D-alanyl-D-alanine carboxypeptidase | SH3
Gamma | N-acetylmuramoyl-L-alanine amidase; PGRP | N-acetylmuramoyl-L-alanine amidase
SPO1 | N-acetylmuramoyl-L-alanine amidase; D-alanyl-D-alanine carboxypeptidase; Spore cortex-lytic enzyme | Putative peptidoglycan-binding
Taylor | N-acetylmuramoyl-L-alanine amidase; Membrane-bound lytic murein transglycosylase D | LysM; SafA
Curly | N-acetylmuramoyl-L-alanine amidase; Membrane-bound lytic murein transglycosylase D | LysM; SafA
Gemini | N-acetylmuramoyl-L-alanine amidase; Membrane-bound lytic murein transglycosylase D | LysM; SafA
Vinny | N-acetylmuramoyl-L-alanine amidase; Glycosyl hydrolase; 1,4-beta-N-acetylmuramidase (Lysozyme M1) | N-acetylmuramoyl-L-alanine amidase
Evoli | N-acetylmuramoyl-L-alanine amidase; Glycosyl hydrolase; 1,4-beta-N-acetylmuramidase (Lysozyme M1) | N-acetylmuramoyl-L-alanine amidase
HoodyT | N-acetylmuramoyl-L-alanine amidase; Glycosyl hydrolase; 1,4-beta-N-acetylmuramidase (Lysozyme M1) | N-acetylmuramoyl-L-alanine amidase
Nigalana | N-acetylmuramoyl-L-alanine amidase; N-acetyl-anhydromuramyl-L-alanine amidase; PGRP | SH3
NotTheCreek | N-acetylmuramoyl-L-alanine amidase; N-acetyl-anhydromuramyl-L-alanine amidase; PGRP | SH3
SageFayge | N-acetylmuramoyl-L-alanine amidase; N-acetyl-anhydromuramyl-L-alanine amidase; PGRP | SH3
DIGNKC | N-acetylmuramoyl-L-alanine amidase; N-acetyl-anhydromuramyl-L-alanine amidase; PGRP | SH3
Zuko | N-acetylmuramoyl-L-alanine amidase; N-acetyl-anhydromuramyl-L-alanine amidase; PGRP | SH3
Methods:
The genomic DNA of bacteriophage Phrodo was sequenced using MiSeq next-generation
sequencing technology. Sequencing reads were assembled using Newbler software, and the
genome assembly was visualized in Consed. Physical genome ends containing terminal repeats
were identified by locating a region with double the read coverage. Gene start positions
were auto-predicted using GeneMark (Besemer 2005) and Glimmer (Salzberg et al. 1998) within
the DNAMaster software (http://phagesdb.org/DNAMaster/). The start positions were confirmed
based on a combination of a 1:1 blastp match, the highest Shine-Dalgarno prediction score,
whether the gene covers all the coding potential, and the size of the gaps and overlaps
between genes. Phamerator (Cresawn 2011), blastp (https://blast.ncbi.nlm.nih.gov/), and
HHpred (Soding 2005) were used to predict the function of each gene.
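The terminal repeat criterion above (a region with roughly double the read coverage) can be illustrated with a minimal sketch. This is not the Consed workflow used in this study; the function name, the 1.8x cutoff, the minimum length, and the depth values are all hypothetical and chosen only for illustration.

```python
from statistics import median

def find_double_coverage(coverage, factor=1.8, min_len=3):
    """Return (start, end) index pairs (end exclusive) where read depth stays
    at or above factor * median depth for at least min_len positions."""
    threshold = factor * median(coverage)
    regions, start = [], None
    for i, depth in enumerate(coverage):
        if depth >= threshold:
            if start is None:
                start = i  # a candidate high-coverage run begins here
        elif start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(coverage) - start >= min_len:
        regions.append((start, len(coverage)))
    return regions

# Hypothetical per-position read depths, not data from this study.
depths = [100, 102, 98, 101, 205, 210, 198, 202, 99, 100, 103, 97]
print(find_double_coverage(depths))
```

Using the median as the baseline keeps the cutoff stable as long as the repeat is short relative to the genome, which is the situation described for these phages.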
Table 1. Genome Characteristics of Bacillus Phages
Characteristic | Phrodo | Jugalone | Zuko | DIGNKC | Claudi
Best Blast Match | Troll | Troll | Megatron | Hakuna | MG-B1
% Query Coverage | 94 | 96 | 91 | 92 | 33
% Identity | 98 | 99 | 88 | 87 | 74
% GC | 37.8 | 37.8 | 38.6 | 38.7 | 30.3
Genome Length, bp | 164443 | 164227 | 163345 | 161552 | 26504
# Predicted ORFs | 288 | 293 | 296 | 296 | 48
# Predicted tRNAs | 0 | 0 | 3 | 3 | 0
Table 2. Average Nucleotide Identity (%)
Phage | Claudi | DIGNKC | Jugalone | Nigalana | NotTheCreek | Phrodo | SageFayge | Vinny | Zuko
Claudi | 100 | 55.5 | 55.6 | 54.7 | 54.8 | 55.9 | 54.8 | 55 | 55.2
DIGNKC | 55.5 | 100 | 61.2 | 87.1 | 87.1 | 61.3 | 87.7 | 62.8 | 90.6
Jugalone | 55.6 | 61.2 | 100 | 61.4 | 61.3 | 95.7 | 60.8 | 66.8 | 61.2
Nigalana | 54.7 | 87.1 | 61.4 | 100 | 92.3 | 61.6 | 91.9 | 62.1 | 87
NotTheCreek | 54.8 | 87.1 | 61.3 | 92.3 | 100 | 61.5 | 92.9 | 62.6 | 87.8
Phrodo | 55.9 | 61.3 | 95.7 | 61.6 | 61.5 | 100 | 61 | 66.8 | 61.2
SageFayge | 54.8 | 87.7 | 60.8 | 91.9 | 92.9 | 61 | 100 | 62.8 | 88
Vinny | 55 | 62.8 | 66.8 | 62.1 | 62.6 | 66.8 | 62.8 | 100 | 62.4
Zuko | 55.2 | 90.6 | 61.2 | 87 | 87.8 | 61.2 | 88 | 62.4 | 100
Phage genome sequences were compared by dot plot and average nucleotide identity (ANI).
Gepard was used to create a dot plot of the phages, which compares two genomes and uses
dots to display the extent of their similarity. DNAMaster was used to calculate ANI across
the whole genome for each pair of phages, and the values were compiled into a table showing
the percent nucleotide identity between each pair. Genome maps were examined using
Phamerator to identify features and functions of genes. Phamerator was also used to make a
genome map of Phrodo, which labels the predicted function of each gene.
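The idea behind a dot plot can be shown with a minimal word-match sketch. This is an illustration only and not Gepard's algorithm; the function name, the word size, and the two short sequences are hypothetical.

```python
def dot_plot(seq_a, seq_b, word=4):
    """Return the set of (i, j) positions where the word starting at seq_a[i]
    exactly matches the word starting at seq_b[j]."""
    # Index every word of seq_b by its start positions.
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)
    # Look up every word of seq_a in that index; each hit is one dot.
    dots = set()
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], []):
            dots.add((i, j))
    return dots

# Hypothetical toy sequences, not phage data.
print(sorted(dot_plot("ACGTACGT", "TTACGTAA")))
print(dot_plot("AAAA", "CCCC", word=2))
```

Runs of dots along a diagonal correspond to shared stretches of sequence, while a comparison with no shared words yields an empty set, which is what the absence of dots for an unrelated genome looks like.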
Figure 1. Dot Plot of Sequenced Bacillus Phages.
Phage endolysin protein sequences were aligned with ClustalW to determine their relative
evolutionary positions in a phylogenetic tree built with endolysin as the comparative
factor. Endolysin sequences from newly found phages and from previously described phages
were taken from Phamerator and put through multiple sequence alignment. Phamerator was also
used to determine whether holin and endolysin are adjacent or far apart on each genome. The
sequences were submitted to blastp to identify conserved domains, which were tabulated by
N-terminal and C-terminal position. The phylogenetic tree was then used to relate endolysin
functional domains to phage similarity in the phylogeny.
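Tree-building methods such as neighbor joining start from pairwise distances between aligned sequences. The sketch below computes a simple p-distance (fraction of differing residues) as an illustration. It is a simplification and not the ClustalW or BLOSUM62-based distance used for the tree in this paper; the phage names and aligned fragments are hypothetical.

```python
from itertools import combinations

def p_distance(a, b):
    """Fraction of differing residues over columns where neither aligned
    sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)

# Hypothetical aligned protein fragments, not sequences from this study.
aligned = {
    "phageA": "MKT-LLAG",
    "phageB": "MKTALLAG",
    "phageC": "MRT-LIAG",
}

for first, second in combinations(aligned, 2):
    print(first, second, round(p_distance(aligned[first], aligned[second]), 3))
```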
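Figure 2 legend. Colors indicate the host bacterium: B. cereus = blue; B. subtilis = yellow;
B. pumilus = green; B. anthracis = orange; B. thuringiensis = purple. Star = endolysin and
holin close together. Functional domain groups are numbered 1 through 6.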
Figure 2. The phage endolysin proteins were aligned with Clustal multiple sequence
alignment, and an endolysin phylogenetic tree was calculated by neighbor joining using
BLOSUM62. The numbers indicate functional domain groups and the stars indicate phages in
which endolysin and holin are contained within a defined lytic cassette.
Group | N-Terminal | C-Terminal
1 | N-acetylmuramoyl-L-alanine amidase; D-alanyl-D-alanine carboxypeptidase | SH3
2 | N-acetylmuramoyl-L-alanine amidase; Membrane-bound lytic murein transglycosylase D | LysM; SafA
3 | N-acetylmuramoyl-L-alanine amidase; Glycosyl hydrolase; 1,4-beta-N-acetylmuramidase | N-acetylmuramoyl-L-alanine amidase
4 | N-acetylmuramoyl-L-alanine amidase; D-alanyl-D-alanine carboxypeptidase; Spore cortex-lytic enzyme | Putative peptidoglycan binding
5 | N-acetylmuramoyl-L-alanine amidase; PGRP | N-acetylmuramoyl-L-alanine amidase (Gamma); SH3, SH3 (Waukesha)
6 | N-acetylmuramoyl-L-alanine amidase; N-acetyl-anhydromuramyl-L-alanine amidase; PGRP | SH3
Figure 4. Functional domains of grouped phages from Figure 2. Annotations of N-terminal and
C-terminal domains by blastp are shown.
Results:
The genomes of five phages infecting Bacillus thuringiensis were sequenced to compare
genome content. Four of the phages are Myoviridae and one (Claudi) is a member of the
Podoviridae, resulting in differences in phage structure and genome sequence. All
Myoviridae genomes are more than 160,000 bp long, while the Claudi genome is around 26,000
bp long (Table 1). Claudi has a lower GC content (30.3%) than the other phages (37.8% to
38.7%), a difference of about 8 percentage points. Phrodo was sequenced and annotated to
analyze its genome features and structure. Its genome is approximately 164,000 bp long, with
288 predicted ORFs. According to Table 1, of the five phages sequenced, Phrodo most closely
resembles Jugalone; for both phages the best blast match is Troll, and both have a higher
% identity than % query coverage (98% vs. 94% and 99% vs. 96%). Zuko and DIGNKC have the
same number of predicted ORFs and tRNAs, although Zuko's best blast match was Megatron while
DIGNKC's was Hakuna. These data show that Claudi is genomically distinct from the other
phages, as expected for a podovirus compared with myoviruses. Furthermore, Phrodo and
Jugalone are highly similar to each other, implying a close relationship in a phylogeny.
The Bacillus phage genomes were compared by ANI calculation, dot plot analysis, and
comparative genome map analysis. The dot plot shows how small Claudi's genome is compared
with those of the myoviruses (Figure 1). Furthermore, Claudi showed no similarity to the
genomes of the other phages, lacking any dots in each comparison. Together, these
observations make it evident that Claudi is genomically different from the myoviruses.
Phrodo and Jugalone displayed extreme similarity to each other and only slight similarity to
the other phages. The dot plot revealed that DIGNKC, Zuko, Nigalana, NotTheCreek, and
SageFayge all shared high similarity with each other. Vinny was the one myovirus that
displayed only slight similarity with the other phages. Table 2 shows the ANI of the phages
on the dot plot. Phrodo and Jugalone were 95.7% identical, while their identity to the other
myoviruses was around 61% (66.8% with Vinny). DIGNKC, Zuko, Nigalana, NotTheCreek, and
SageFayge are all highly similar to each other (87% or higher), but some pairs are more
similar than the rest of the group. Zuko and DIGNKC are 90.6% identical to each other but
about 87% to 88% identical to Nigalana, NotTheCreek, and SageFayge. Conversely, Nigalana,
NotTheCreek, and SageFayge are about 92% identical to each other. Vinny is less than 70%
identical to any other phage in the table. The comparative genome map of VCU phages
(Appendix 1) was used to further identify similarities and differences among the phages,
which were grouped according to the ANI table and dot plot. The map supports the finding
that NotTheCreek, SageFayge, Nemo, Nigalana, and Zuko are highly similar, since they share
large regions of purple shading. Phrodo and Jugalone likewise share a large portion of
purple shading. Several sites of recombination occur in both groupings and account for
partial differences. As expected, Vinny shows little similarity with the other phages, and
Claudi is small and has little similarity to the other phages. Using dot plots, ANI
calculation, and genome maps in combination is therefore useful for identifying related
phages by comparing their genomes.
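The groupings read off the ANI table can be reproduced with a small single-linkage sketch over the Table 2 values. The function name and the 85% and 90% cutoffs are assumptions made for illustration; they are not thresholds defined in this paper or by DNAMaster.

```python
names = ["Claudi", "DIGNKC", "Jugalone", "Nigalana", "NotTheCreek",
         "Phrodo", "SageFayge", "Vinny", "Zuko"]

# ANI values (%) copied from Table 2, rows and columns in the order of names.
ani = [
    [100, 55.5, 55.6, 54.7, 54.8, 55.9, 54.8, 55, 55.2],
    [55.5, 100, 61.2, 87.1, 87.1, 61.3, 87.7, 62.8, 90.6],
    [55.6, 61.2, 100, 61.4, 61.3, 95.7, 60.8, 66.8, 61.2],
    [54.7, 87.1, 61.4, 100, 92.3, 61.6, 91.9, 62.1, 87],
    [54.8, 87.1, 61.3, 92.3, 100, 61.5, 92.9, 62.6, 87.8],
    [55.9, 61.3, 95.7, 61.6, 61.5, 100, 61, 66.8, 61.2],
    [54.8, 87.7, 60.8, 91.9, 92.9, 61, 100, 62.8, 88],
    [55, 62.8, 66.8, 62.1, 62.6, 66.8, 62.8, 100, 62.4],
    [55.2, 90.6, 61.2, 87, 87.8, 61.2, 88, 62.4, 100],
]

def group_by_ani(names, ani, threshold):
    """Single-linkage grouping: two phages share a group when a chain of
    pairwise ANI values >= threshold connects them."""
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the group representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if ani[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return sorted(groups.values())

print(group_by_ani(names, ani, threshold=85))  # illustrative cutoff
print(group_by_ani(names, ani, threshold=90))  # stricter illustrative cutoff
```

At the looser cutoff the five similar myoviruses fall into one group and Phrodo and Jugalone into another; at the stricter cutoff the larger group splits into the Zuko and DIGNKC pair and the Nigalana, NotTheCreek, and SageFayge trio.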
The Phrodo genome map was obtained from Phamerator and shows functional gene predictions as
well as the pham of each gene (Figure 5). The map shows the terminal repeat region,
consisting of three duplicated genes, with a non-coding region in front. Functions were
predicted for 62 genes. A portion of the genes, from about 2,400 bp to 27,000 bp, was
transcribed in the reverse direction. The structural genes were all located within the same
section of the genome, from around 54,200 bp to 67,500 bp. Although endolysin and holin are
usually adjacent, they were far apart on the Phrodo genome, implying evolutionary changes
from their positions in other phages. The genome also contained two novel genes, gp15 and
gp106. These genes did not appear in Jugalone and account for minor differences between the
genomes. Furthermore, the Phamerator map of Phrodo shows several locations of
recombination. Genes that appeared in the Phrodo genome but not in Jugalone include gp30,
gp166, gp219, gp222, gp235, gp259, and gp262. The map also shows sites where genes were
absent from Phrodo but present in the Jugalone genome. These areas of white shading in
Phamerator indicate regions of low nucleotide similarity and possible sites of mutation.
Together with recombination, these areas account for the slight differences in nucleotide
identity between the genomes.
Figure 5. Genome map of Phrodo showing functional gene predictions. The different colors of genes
represent different phams.
Endolysin and holin are expected to be co-regulated when found in a defined lytic cassette.
In Phrodo, these genes are separated by half a genome length, and the regulation of their
expression is unclear. We looked for predicted promoters in intergenic regions, in the same
orientation as our genes of interest. Data found by Morgan Van Driest showed that endolysin
may be controlled by a predicted promoter recognized by a sigma 32 homolog, while holin was
predicted to be controlled by a sigma 70 homolog (Figure 6). The predicted sigma 32
promoter is supported by the fact that terminase and endolysin would be transcribed
together as genes required late in the viral life cycle. This is logical because terminase
is not needed until the head and tail are assembled and the head is ready to be filled with
DNA before connecting to the tail (Black 2012 Elsevier). Endolysin then cleaves the cell
wall, releasing mature phages. The sigma 70 promoter was predicted by comparison with the
Jugalone genome, looking for the closest non-coding region present in both phages, on the
reasoning that expression of this essential protein would occur from a conserved promoter.
This promoter was located near 106,600 bp. Van Driest compared her findings to phage Gamma,
in which endolysin and holin are adjacent in a defined lytic cassette. She predicted a
sigma 32 promoter around 7,700 bp in Gamma, very similar to the promoter predicted for
expression of Phrodo's endolysin gene.
Figure 6. Predicted sigma 32 promoter region for endolysin at 27,000 bp (top) and
predicted sigma 70 promoter region for holin at 106,600 bp (bottom). Promoter
locations are indicated with an orange arrow.
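The first step of the promoter search described above, listing intergenic regions, can be sketched as follows. The gene names, coordinates, and the 50 bp minimum gap are hypothetical and are not taken from the Phrodo annotation; filtering by gene orientation and the promoter prediction itself are left out.

```python
def intergenic_regions(genes, min_gap=50):
    """genes: list of (name, start, stop) with 1-based inclusive coordinates.
    Return (left_gene, right_gene, gap_start, gap_stop) for every gap between
    neighboring genes that is at least min_gap bp long."""
    ordered = sorted(genes, key=lambda gene: gene[1])
    gaps = []
    for left, right in zip(ordered, ordered[1:]):
        gap_start, gap_stop = left[2] + 1, right[1] - 1
        # Overlapping or tightly packed genes give a short or negative gap.
        if gap_stop - gap_start + 1 >= min_gap:
            gaps.append((left[0], right[0], gap_start, gap_stop))
    return gaps

# Hypothetical gene coordinates for illustration only.
genes = [("gpA", 100, 900), ("gpC", 1700, 2600), ("gpB", 920, 1500)]
print(intergenic_regions(genes))
```

Each reported gap is a candidate region that could then be passed to a promoter prediction tool.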
Endolysin sequences from phages found this semester were added to a file containing
previously described phage endolysin sequences taken from Phamerator. Figure 2 depicts the
phylogenetic tree that resulted from multiple sequence alignment. Endolysins are grouped on
the tree according to the host bacterium used for isolation. The groups on the tree were
compared to the conserved functional domains (identified by blastp) of the phage endolysins
in Table 3. The first group consisted of phages B4, Troll, Phrodo, and Jugalone. Their
functional domains were identical, with N-acetylmuramoyl-L-alanine amidase and
D-alanyl-D-alanine carboxypeptidase at the N-terminus and an SH3 binding domain at the
C-terminus. The second group included Taylor, Curly, and Gemini, which also shared the same
functional domains: N-acetylmuramoyl-L-alanine amidase and membrane-bound lytic murein
transglycosylase D at the N-terminus, and LysM and SafA at the C-terminus. The third group,
consisting of HoodyT, Vinny, and Evoli, has N-acetylmuramoyl-L-alanine amidase, glycosyl
hydrolase, and 1,4-beta-N-acetylmuramidase (Lysozyme M1) at the N-terminus and
N-acetylmuramoyl-L-alanine amidase at the C-terminus. SPO1 stands alone as the fourth group,
with N-acetylmuramoyl-L-alanine amidase, D-alanyl-D-alanine carboxypeptidase, and spore
cortex-lytic enzyme at the N-terminus and a putative peptidoglycan-binding domain at the
C-terminus. The fifth group, consisting of Gamma and Waukesha, shares the N-terminal
domains N-acetylmuramoyl-L-alanine amidase and PGRP, but its members have different
C-terminal domains. The sixth group, containing DIGNKC, Zuko, SageFayge, Nigalana, and
NotTheCreek, shares N-acetylmuramoyl-L-alanine amidase, N-acetyl-anhydromuramyl-L-alanine
amidase, and PGRP at the N-terminus and SH3 at the C-terminus. Groups 1 and 6 are the only
groups in which all known members have only SH3 as their C-terminal domain. Although groups
1, 3, and 6 all belong to the B. thuringiensis group, group 3 is distinct in that it has a
cleavage domain at its C-terminal end, whereas the other two groups share a cell
wall-binding domain. Phage Waukesha from group 5 has two SH3 domains at its C-terminus, in
contrast to Gamma, which has a cleavage domain at its C-terminus. The N-terminal end is
fairly conserved across the phylogeny, with N-acetylmuramoyl-L-alanine amidase present as a
cleavage domain in all N-terminal regions. Further research might show whether differences
in cleavage and cell wall-binding domains account for the separation of the groups.
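The domain comparison above amounts to grouping phages by shared domain architecture. The sketch below does this for the N-terminal domains, with the domain assignments transcribed from the description in this section. The variable and function names are hypothetical, and the grouping key (the exact tuple of N-terminal domains) is an assumption made for illustration.

```python
from collections import defaultdict

AMIDASE = "N-acetylmuramoyl-L-alanine amidase"
CPASE = "D-alanyl-D-alanine carboxypeptidase"
MLTD = "membrane-bound lytic murein transglycosylase D"
GH = "glycosyl hydrolase"
MURAMIDASE = "1,4-beta-N-acetylmuramidase"
SCLE = "spore cortex-lytic enzyme"
ANHYDRO = "N-acetyl-anhydromuramyl-L-alanine amidase"

# phage -> (N-terminal domains, C-terminal domains), as described in the text.
endolysins = {}
for phage in ("B4", "Troll", "Phrodo", "Jugalone"):
    endolysins[phage] = ((AMIDASE, CPASE), ("SH3",))
for phage in ("Taylor", "Curly", "Gemini"):
    endolysins[phage] = ((AMIDASE, MLTD), ("LysM", "SafA"))
for phage in ("HoodyT", "Vinny", "Evoli"):
    endolysins[phage] = ((AMIDASE, GH, MURAMIDASE), (AMIDASE,))
endolysins["SPO1"] = ((AMIDASE, CPASE, SCLE), ("putative peptidoglycan-binding",))
endolysins["Gamma"] = ((AMIDASE, "PGRP"), (AMIDASE,))
endolysins["Waukesha"] = ((AMIDASE, "PGRP"), ("SH3", "SH3"))
for phage in ("DIGNKC", "Zuko", "SageFayge", "Nigalana", "NotTheCreek"):
    endolysins[phage] = ((AMIDASE, ANHYDRO, "PGRP"), ("SH3",))

def group_by(table, key):
    """Group phage names by key(domains); members are sorted within a group."""
    groups = defaultdict(list)
    for phage, domains in table.items():
        groups[key(domains)].append(phage)
    return {k: sorted(v) for k, v in groups.items()}

by_n_terminal = group_by(endolysins, key=lambda domains: domains[0])

# Groups in which every member has only SH3 at the C-terminus.
all_sh3_groups = [
    members for members in by_n_terminal.values()
    if all(set(endolysins[p][1]) == {"SH3"} for p in members)
]

for n_terminal, members in by_n_terminal.items():
    print(members, "<-", "; ".join(n_terminal))
print(all_sh3_groups)
```

Grouping on the N-terminal domains alone gives six groups, and only the first and sixth consist entirely of endolysins with SH3 as the sole C-terminal domain.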
Discussion:
No phage is completely identical to another, reflecting the large diversity of genome
sequences in phages. Phages with different morphologies show very little similarity between
their genomes. In our collection of sequenced phages, we found four Myoviridae and one
Podoviridae, with significant similarity in genomic characteristics within the myovirus
group. The Bacillus database currently contains 521 documented phages, only 36 of which
have been sequenced (bacillus.phagesdb.org). In comparison, the NCBI virus genomes database
contains 59 sequenced Bacillus phages, of which 26 are Myoviridae and 6 are Podoviridae
(ncbi.nlm.nih.gov/genome/viruses/). In that database, the average genome size and number of
ORFs for Myoviridae were 168,962 bp and 271, respectively. For Podoviridae, the average
genome size was 27,658 bp and the average number of ORFs was 37. These averages closely
resemble the genomic characteristics in Table 1. All myovirus genomes in the table are
between about 161,500 and 164,500 bp and have about 290 ORFs. As Claudi is a podovirus, its
genome is expected to be significantly smaller and to contain fewer ORFs. The ratio of
myoviruses to podoviruses is also noteworthy. The NCBI database contains far fewer
podoviruses than myoviruses. Likewise, of the 20 phages found in our study, 19 were
myoviruses and one was a podovirus. Four of the myoviruses and the one podovirus were
sequenced. Based on this information, the myoviruses and podovirus found at VCU appear
typical: their genome sizes and numbers of ORFs follow those of the phages in the
databases.
Phages with highly similar nucleotide sequences are likely to be evolutionarily close and
may form a cluster with similar genomic characteristics. Previously, 60 mycobacteriophages
were clustered according to genome similarity and gene location (Hatfull 2010 JMB). Based on
the results of our ANI, dot plot, and comparative genome map analyses, two logical groupings
can be made. The first is the grouping of Zuko and DIGNKC. Their ANI was one of the highest
(90.6%), and their dot plot showed significant similarity. However, they are the most
similar pair within a larger logical grouping that consists of Zuko, DIGNKC, NotTheCreek,
Nigalana, and SageFayge. NotTheCreek, Nigalana, and SageFayge all showed ANI values above
90% with each other and were consistently similar according to the dot plot and genome maps.
However, the ANI between the Zuko and DIGNKC pair and NotTheCreek, Nigalana, and SageFayge
was about 87-88% (Table 2). All five phages can therefore be logically grouped together
because of their high similarity, but separated within that group because of the higher
similarity between certain members.
Furthermore, Phrodo and Jugalone can be grouped together because of their extensive genome
similarity. They exhibited the highest ANI value, 95.7% (Table 2). On the dot plot, they
matched each other strongly and showed only slight similarity to the other phages (Figure
1). Finally, comparison of whole genome maps shows nearly identical genomes with a small
amount of recombination relative to the large size of the genomes. Both genomes have similar
terminal repeat regions consisting of the same three genes, except that Jugalone contains an
additional predicted gene in its terminal repeat. Phages can be clustered when nucleotide
identity extends over large genome segments. According to Hatfull, one way to identify this
similarity is to analyze a dot plot and determine whether two genomes show evident sequence
similarity spanning more than 50% of the genome. Another way is to analyze ANI values.
Phages with values of 53%-59% are not clustered, whereas those with high ANI values can be
clustered (Hatfull 2010 JMB). These groupings can be made because of the strong similarity
between phages, or the lack thereof.
In contrast, Vinny and Claudi showed little or no similarity to any of the phages studied.
The dot plot lacked any lines for Claudi (Figure 1). According to Hatfull, ANI values
between phages that are not clustered together range between 53% and 59% (Hatfull 2010
JMB). Claudi could therefore not be clustered with any phage in the table because its ANI
values were all within the 53%-59% range, making Claudi a singleton. Vinny, in contrast,
shows diffuse dotting across the plot with the two larger groups. Comparison of whole genome
maps shows that Vinny has many genes in the same phams as the rest of the phages in the
table. However, comparison using Phamerator showed little to no purple shading, which would
signify high nucleotide identity. Vinny's ANI values with the other myoviruses were in the
low to mid 60s, suggesting that Vinny might once have been very similar to the rest of the
phages but has since diverged. Hatfull placed phages with ANI values in the range of 63%-67%
into subclusters, suggesting that Vinny might be placed in a subcluster (Hatfull 2010 JMB),
whereas Claudi remains a singleton in this group of phages. The comparative genome maps
further support the similarities found through ANI and dot plot analysis. The maps show
regions of genome similarity that can be used to assign a phage to a cluster, and they also
show similarity between distant genomes and where they diverged from more closely related
phages (Hatfull 2010 JMB). As mentioned earlier, two logical groupings of phages could be
made through analysis of ANI and the dot plot. The phages from one group that were compared
on the genome map included Zuko, SageFayge, Nigalana, and NotTheCreek. These phages were
already shown to be highly similar to each other, although Zuko and DIGNKC were slightly
less similar to the other phages in the group. The comparative genome map makes it possible
to identify where the phages began to diverge into their own subgroup. In the Hatfull paper,
the genomes of phages were aligned and categorized into clusters and subclusters, and the
comparative map helped determine where the genomes begin to diverge and become a subcluster
(Hatfull 2010 JMB). Studying the genome maps can thus reveal more about related phages.
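The comparison of Claudi and Vinny against the ANI ranges quoted above can be written out as a short check. The ranges are the ones this paper attributes to Hatfull; using each phage's highest ANI to any other phage as the deciding value is an assumption made for this sketch, and the function name is hypothetical.

```python
# ANI values (%) to the other eight phages, copied from Table 2.
ani_to_others = {
    "Claudi": [55.5, 55.6, 54.7, 54.8, 55.9, 54.8, 55, 55.2],
    "Vinny": [55, 62.8, 66.8, 62.1, 62.6, 66.8, 62.8, 62.4],
}

def ani_range(value):
    """Label one ANI value using the two ranges quoted in this paper."""
    if 53 <= value <= 59:
        return "unclustered (53-59%)"
    if 63 <= value <= 67:
        return "subcluster (63-67%)"
    return "outside quoted ranges"

for phage, values in ani_to_others.items():
    best = max(values)  # assumption: judge each phage by its closest relative
    print(phage, best, ani_range(best))
```

All of Claudi's values fall inside the 53-59% range, whereas only Vinny's two highest values (66.8% with Phrodo and Jugalone) reach the 63-67% range, which is why the subcluster placement is stated tentatively.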
Phrodo was annotated for functional gene predictions. Functions were predicted for 62 of its
288 genes. A few genes were introduced into the Phrodo genome by recombination. Some of
these were novel genes, while others are found in other phages. Of these genes, only two
were predicted to have a function; interestingly, both were predicted to function as HNH
endonucleases. The sigma factor gene was annotated toward the middle of the genome. In
addition, endolysin was found near the beginning of the genome, while holin was farther
away.
Expression of endolysin and holin is presumed to be tightly regulated. The promoters we
predicted (Figure 6) suggest that endolysin and holin genes that are not near each other in
the genome do not necessarily need the same promoter. The difference in predicted sigma
factors in Phrodo suggests that holin may be expressed from a sigma 70 promoter (early gene
expression), while endolysin may be expressed from a sigma 32-like promoter (unknown time of
expression, but different from early genes). In comparison, we predicted that a sigma
32-like promoter would be used to express the lytic cassette of group 5 phages such as Gamma
(Figure 2). Endolysin is the more significant of the two since it carries out cleavage of
the cell wall. Holin is produced and accumulates in the membrane without any effect on
membrane integrity. The sudden triggering of holin to form pores in the membrane is timed in
an allele-specific manner. Only then is endolysin able to destroy the cell wall. Since
endolysin relies on the pore formation triggered by holin, it can be transcribed at any time
as long as it is produced before pore formation is triggered (White 2010 PNAS). This
supports our data that holin can either be transcribed before endolysin (sigma 70 promoter)
or at the same time as endolysin (sigma 32-like promoter), because holin controls lysis
timing.
It might be noteworthy to identify the N-terminus as the determining factor. The different
functional domains of endolysin appear to contribute to the diversity among groups in the
phylogenetic tree. Because the N-terminal region varies among groups, it may play a key role
in the branching of the tree. A study by Becker revealed that the SH3 domain enhances the
activity of the catalytic domain but is not essential for lytic activity, and that the
N-terminus of the protein plays an essential role in cell lysis (Becker 2014 FEMS). Groups 1
and 6 share the cell wall-binding (CWB) domain SH3 but differ in their catalytic domains.
This difference in catalytic domains despite the same CWB domain might explain the
separation of the two groups on the tree. All groups share one common domain
(N-acetylmuramoyl-L-alanine amidase) but also have their own. This difference in cleavage
domains might also relate to host lysis ability and range. More research would be needed to
establish a relationship between cleavage domains and host lysis range among these phages
and to see whether C-terminal domains suggest a different grouping of the phylogenetic tree,
possibly with groups 1 and 6 together. Future projects include creating domain phylogenies
to better understand the grouping of the phages.
Studying endolysin properties and characteristics, especially their functional domains,
might reveal their capabilities in the future of phage therapy. For example, phage B4
endolysin has SH3 as a cell wall-binding domain and D-alanyl-D-alanine carboxypeptidase as a
cleavage domain, and it appears in group 1 of our phylogeny (Figure 2). LysB4 was also
experimentally characterized as an L-alanoyl-D-glutamate endopeptidase with an optimum
temperature of 50 °C and an optimum pH of 8.5. The broad host lysis activity of B4 suggests
that it could be used as a biocontrol agent in phage therapy against B. cereus (Lee 2013
Arch Virol). As an extension, a more fine-tuned comparison of group 1 endolysins to this
published work might reveal the importance of the N-terminus in optimizing host lysis. By
understanding which cleavage domains work well for lysis of specific hosts, and which CWB
domains enhance them best, research can progress toward identifying the ideal combination of
the two to improve phage therapy.
Works Cited
Besemer, J., and Borodovsky, M. (2005). GeneMark: web software for gene finding in
prokaryotes, eukaryotes, and viruses. Nucleic Acids Research. 33. 451-454.
doi:10.1093/nar/gki487
Black, L., and Rao, V. (2012). Structure, assembly, and DNA packaging of the
bacteriophage T4. Elsevier. 82. 119-147. doi: 10.1016/B978-0-12-394621-8.00018-2
Becker, S., Swift, S., and Korobova, O. (2014). Lytic activity of the staphylolytic Twort
phage endolysin CHAP domain is enhanced by the SH3b cell wall binding
domain. FEMS Microbiology Letters. 362. 1-8. doi: 10.1093/femsle/fnu019
Cresawn, S., Bogel, M., Day, N., Jacobs-Sera, D., Hendrix, R., and Hatfull, G. (2011).
Phamerator: a bioinformatic tool for comparative bacteriophage genomics. BMC
Bioinformatics. 12(395). 1-14. http://www.biomedcentral.com/1471-2105/12/39
Gillis, A., and Mahillon, J. (2014). Phages preying on Bacillus anthracis, Bacillus cereus,
and Bacillus thuringiensis: Past, Present and Future. Viruses. 6. 2623-2672.
www.mdpi.com/journal/viruses
Hatfull, G., Jacobs-Sera, D., and Lawrence, J. (2010). Comparative genomic analysis of
60 mycobacteriophage genomes: genome clustering, gene acquisition, and gene
size. Journal of Molecular Biology. 397. 119-143. doi:10.1016/j.jmb.2010.01.011
Lee, J., Shin, H., and Son, B. (2013). Characterization and complete genome sequence of
a virulent bacteriophage B4 infecting food-borne pathogenic Bacillus cereus.
Archives of Virology. 158. 2101-2108. doi: 10.1007/s00705-013-1719-2.
Nelson, D., Schmelcher, M., and Rodriguez-Rubio, L. (2012). Chapter 7-Endolysins as
antimicrobials. Advances in Virus Research. 83. 229-365. doi:10.1016/B978-0-12-394438-2.00007-4
Nicol, K. (2003). Virulence factors carries on phages. Microbial Genetics. Web. 3 May.
2015. http://www.sci.sdsu.edu.
Salzberg, S., Delcher, A., Kasif, S., and White, O. (1998). Microbial gene identification
using interpolated Markov models. Nucleic Acids Research. 26(2). 544-548.
Soding, J., Biegert, A., and Lupas, A. (2005). The HHpred interactive server for protein
homology detection and structure prediction. Nucleic Acids Research. 33. 244-248. doi:10.1093/nar/gki408
Svoboda, E. (2009). The next phage. Popular Science. Web
http://www.popsci.com/scitech/article/2009-03/next-phage
White, R., Chiba, S., and Pang, T. (2010). Holin triggering in real time. Proceedings of the
National Academy of Sciences of the United States of America. 108(2). 798-803.
doi: 10.1073/pnas.1011921108
Appendix
1. Comparative genomics map of Phages:
https://blackboard.vcu.edu/bbcswebdav/pid-5371969-dt-content-rid14822722_2/courses/BNFO-252-001-2015Spring/VCU%20phages%20maps.pdf