The Eco-evolutionary Dynamics of Extrachromosomal Elements
in Environmental Vibrio
by
Hong Xue
B.S., Sichuan University (1998)
Submitted to the Department of Civil and Environmental Engineering
in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
September 2014
©2014 Massachusetts Institute of Technology. All rights reserved.
Signature redacted
Signature of Author.....................................................
Department of Civil and Environmental Engineering
August 22, 2014
Signature redacted
Certified by..........................
Martin F. Polz
Professor of Civil and Environmental Engineering
Thesis Advisor
Signature redacted
Accepted by........................
Heidi M. Nepf
Chair, Departmental Committee for Graduate Students
The Eco-evolutionary Dynamics of Extrachromosomal Elements in
Environmental Vibrio
by
Hong Xue
Submitted to the Department of Civil and Environmental Engineering
on August 22, 2014 in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy in Environmental Biology
ABSTRACT
Plasmids and other extrachromosomal elements (ECEs) are recognized as key factors
mediating horizontal gene transfer; however, their diversity and dynamics among
ecologically structured host populations in the wild remain poorly understood. Here we
take a population-genomic approach to determine carriage of different types of ECEs in a
recently established model for ecologically and genetically cohesive bacterial populations,
asking whether different ECE types (i) are primarily associated with host phylogeny or
ecology, (ii) have distinct transfer (and loss) patterns, and (iii) display different
microevolutionary dynamics. We employed two models of environmental bacterial
populations: a Vibrio cholerae population isolated from a coastal brackish pond (Oyster
Pond, Woods Hole, MA), and diverse co-existing Vibrio populations comprising several
species from Plum Island Sound (Ipswich, MA). A novel filamentous phage, VCYΦ, was
detected at high frequency (>40%) in a collection of 531 V. cholerae isolates.
VCYΦ occurs both in a host-genome integrative form (IF) and in a plasmid-like replicative
form (RF). The relative frequency of each form differed among isolates from portions of
the pond displaying different salinities, suggesting potential impact of host habitat on the
biology of bacteriophages. Using the second model, we isolated 187 ECEs from 660
isolates previously categorized into 25 different ecologically and genetically cohesive
populations. We identified the following elements: 22 bacteriophages, and 24 conjugative,
38 mobilizable and 103 so-called non-transmissible ECEs. While mobilizable ECEs
require co-occurring conjugative plasmids for successful transfer, non-transmissible ECEs
do not encode any genes for self-transfer. We further found that ECEs were significantly
enriched in free-living cells, suggesting association of ECEs with host environment. The
finding of phage as a major and stable ECE component is surprising and the absence of
any integrase genes suggests that these are lysogens that do not integrate into the host
genome. Finally, our data show that a type of plasmid previously defined as "non-transmissible" appears to be the most common among Vibrio ECEs and that these plasmids have been
transferred recently and frequently among distantly related populations through
mechanisms yet to be uncovered. Overall, this study suggests a dynamic mobile gene pool
with high turnover among host populations.
Thesis Supervisor: Martin F. Polz
Title: Professor of Civil and Environmental Engineering
ACKNOWLEDGEMENTS
I want to express my deepest appreciation to my mentor Dr. Martin F. Polz, for the
inspiration, generosity and patience he extended to me along this long journey that is finally
coming to an end. Martin has never doubted my ability and has always encouraged me to
keep learning and never give up. I truly appreciate and value everything Martin has taught
me and am very proud to be Martin's student.
I thank all my thesis committee members, Dr. Edward Delong, Dr. Eric Alm and Dr. Janelle
Thompson, for their valuable advice and support. I also particularly thank my coworkers
Dr. Otto Cordero and Dr. Francisco Camas and our collaborators Dr. William Trimble, Dr.
Julien Guglielmini and Dr. Eduardo P. C. Rocha for their enthusiasm in developing pioneering
tools that helped tremendously with this project. Without their efforts, this study could
never have reached so far. I sincerely thank my fellow student Katherine Kauffman, for never
hesitating to share her thoughts about my research and for always telling me to believe in
myself. My gratitude is also extended to my dear colleagues, Dr. Young Boucher, Ms. Yan
Xu, Dr. Dana Hunt, Dr. Sarah Preheim, Michael Cutler and all Polz lab members for
making this lab such a loving family that I am honored to have been a part of.
I would also like to express my special thanks to Dr. Matthew Waldor, Edward H. Kass
Professor of Medicine at Harvard Medical School, for offering me the precious opportunity
to participate in his research. His kindness, generosity and great sense of humor have
helped me gain confidence.
I am honored to have worked with Mingshu Zhan from the EUROP program, and Rafal
Sledziewski from the Research Science Institute 2008 program. Both have shown
eagerness to explore and to try, and it is my honor to have mentored them.
Lastly, I want to thank my parents Yun Chao and Baoping Xue, for never showing any
doubt about my ability, decisions and stubbornness and for never stopping me from fulfilling
my dream. I thank my wife, Su Xu, for helping with my thesis and creating a positive
environment at home. I am proud to be the father of my two wonderful boys Yuran and
Roman who make me happy whenever and wherever I go.
Table of Contents
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
Chapter 2. High Frequency of a Novel Filamentous Phage, VCYΦ, within an Environmental Vibrio cholerae Population
Chapter 3. Diversity and Dynamics of Extrachromosomal Elements among Ecologically-Defined Host Populations
Chapter 4. Conclusions and Future Directions
List of Figures
Chapter One
Figure 1. Comparison of two explanations for unexpected phylogenetic distribution
Figure 2. Overview of plasmids and conjugative transfer in the horizontal spread of genes
Figure 3. Schematic models for type I, II and III partitioning system
Figure 4. Schematic view of the genetic constitution of transmissible plasmids (A) and some essential interactions in the process of conjugation (B)
Chapter Two
Figure 1. Genome organization of VCYΦ phage
Figure 2. Electron micrograph of VCYΦ phage particles
Figure 3. attP site of VCYΦ and attB site of integration of VCYΦ into chromosome II of strain 4A01LW1
Chapter Three
Figure 1. Distribution of ECEs among Vibrio hosts
Figure 2. Distribution of the number of ECEs per strain for all Vibrio isolates with at least one ECE
Figure 3. ECE family diversity as a function of family size, ECE size and classes
Figure 4. ECE family distribution across the Vibrio phylogeny
Figure 5. ECE genome cluster network
Figure 6. Sequence alignment of ECEs in three representative clusters from the network analysis
List of Tables
Chapter One
Table 1. Summary of characterized Vibrio plasmids
Chapter Two
Table 1. List of primers used in this study
Table 2. Frequency of the IF and RF of VCYΦ phage
Chapter Three
Table 1. An extended bar code set beyond the Roche Titanium-compatible bar code kit
Table 2. The number, GC content and size of contigs
Table 3. Classification, host and population of ECEs
Table 4. Summary of strains carrying ECEs
CHAPTER ONE
Introduction
1. Chapter One: Introduction
1.1. Horizontal Gene Transfer (HGT)
The determination of evolutionary relationships among microbes was enabled by
comparison of 16S rRNA sequences. These proved useful for the analysis of phylogenetic
relationships among all living organisms (1) since ribosomal RNAs are universally
distributed among all three domains of life: Archaea, Bacteria and Eukaryotes. As a
component of the translation systems, the 16S rRNA maintains high functional constancy
and can be readily sequenced by use of PCR-based methods to amplify the gene (2).
However, increasing availability of sequence data from other genes has shown that these
can reveal dramatically different evolutionary relationships of species, causing incongruent
signals (3) (Figure 1). While other hypotheses were being developed to explain such
incongruence, Smith et al. (1992) proposed that horizontal gene transfer (HGT) might be
an important contributing factor (4). In fact, the very first evidence of HGT was the
observation that virulence determinants could be transferred between pneumococci in
infected mice (5). This was later proven to be the consequence of the uptake of genetic
material through transformation, which we now know is one of the principal HGT
mechanisms; however, HGT was ignored for a long time as an event that occurs only
occasionally under specific conditions.
A breakthrough took place when comparative genomics of bacteria and archaea revealed
that a significant number of genes in bacteria were acquired from distantly related species
(6, 7). For example, it was suggested that more than 20% of the ORFs in the
genome of the bacterium Thermotoga maritima are homologous to genes of archaeal species,
indicating frequent cross-species HGT events (8, 9). Moreover, genomic analyses have
detected several microbial species containing two types of rRNA operons of
different origins (10).
Figure 1. Comparison of two explanations for unexpected phylogenetic
distribution. a. The presence of a gene with characteristics that are typical for an
unrelated group can be due to horizontal gene transfer (HGT, arrow). b. An alternative
explanation is an ancient gene duplication (*) followed by differential gene loss (x). The
more sister lineages have only the typical gene, the more independent gene-loss events
must be postulated under this scenario. Gogarten J and Townsend J, Nature Reviews
Microbiology, Volume 3, September 2005 (Reprinted by permission from Nature
Publishing Group).
Horizontal gene transfer can be detected from several kinds of evidence, including
atypical nucleotide composition, anomalous phylogenetic distribution, differences in gene
content among closely related species and incongruent phylogenetic trees (9, 11-13). In
theory, HGT should produce incongruent phylogenetic trees when different genes are compared.
Although other causes, such as poor data and different species
sampling for different genes, should not be ruled out, incongruent trees have been the most
reliable way of identifying HGT. For example, Martinez et al. detected phylogenetic incongruence between
a tree based on 16S rRNA and one based on 48 PIB-type ATPase sequences, suggesting an ancient
gene transfer from a member of the [-proteobacteria (14).
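As a minimal illustration of what tree incongruence means in practice, the sketch below compares two rooted trees by counting the clades that occur in only one of them, in the spirit of a Robinson-Foulds comparison. It is not the method used in the studies cited above; the nested-tuple tree encoding, the function names and the four-taxon example are assumptions made purely for illustration.

```python
def clades(tree):
    """Collect every clade of a rooted tree as a frozenset of leaf names.

    The tree is a nested tuple of strings, e.g. ((("A", "B"), "C"), "D").
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):          # a leaf: just report its name
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)                 # an internal node defines a clade
        return members

    leaves(tree)
    return found


def clade_distance(tree_a, tree_b):
    """Count clades found in exactly one of the two trees (0 = congruent)."""
    return len(clades(tree_a) ^ clades(tree_b))


# Hypothetical four-taxon example: the marker-gene tree groups A with B,
# whereas the tree for a second gene groups A with D.
marker_tree = ((("A", "B"), "C"), "D")
other_gene_tree = ((("A", "D"), "C"), "B")

print(clade_distance(marker_tree, marker_tree))      # identical topologies
print(clade_distance(marker_tree, other_gene_tree))  # conflicting topologies
```

A non-zero distance only signals conflict between the two gene histories; as noted above, poor data or uneven taxon sampling can produce the same signal, so such a score is a starting point rather than proof of HGT.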
In addition, atypical gene composition, that is, compositional bias in codons or
nucleotides (15-17), has also been used to identify HGT. In general, sequences that belong
to the same genome share common patterns in G+C composition and in the usage of codons and
oligonucleotides, patterns that are shaped by natural selection and mutational bias (18, 19).
Because each species has followed its own evolutionary path, its genome carries a
characteristic compositional signature against which foreign DNA can be detected
(20). Genes whose base composition is not typical of their host strain may therefore have
originated from distantly related donor organisms.
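The compositional approach can be sketched as follows. This is a deliberately simple screen, assuming that a gene is "atypical" when its G+C content lies more than a chosen number of standard deviations from the genome-wide mean; the cutoff, the function names and the toy sequences are hypothetical, and published methods (15-20) also use codon and oligonucleotide usage.

```python
from statistics import mean, pstdev


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return gc / total if total else 0.0


def flag_atypical_genes(genes, z_cutoff=2.0):
    """Return names of genes whose G+C content lies more than z_cutoff
    standard deviations from the mean over all genes in the genome."""
    gc_by_gene = {name: gc_content(seq) for name, seq in genes.items()}
    mu = mean(gc_by_gene.values())
    sigma = pstdev(gc_by_gene.values())
    if sigma == 0:                      # perfectly uniform composition
        return []
    return sorted(name for name, gc in gc_by_gene.items()
                  if abs(gc - mu) / sigma > z_cutoff)


# Toy genome (invented sequences): six genes at 50% G+C and one at 90%.
toy_genome = {f"gene{i}": "ATGC" * 10 for i in range(1, 7)}
toy_genome["candidate"] = "GCGCGCGCGA" * 4

print(flag_atypical_genes(toy_genome))
```

A screen of this kind misses transfers between donors and recipients of similar composition, and acquired genes gradually drift toward the composition of their new host, so it is best read as one line of evidence among several.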
HGT often results in the appearance of a new gene in a particular species. If a gene is present
in only some strains of a species, one might suspect the involvement of HGT.
However, such an uneven distribution pattern can also be caused by gene loss or rapid
sequence divergence (11). Therefore, distribution patterns alone might not be a reliable
basis for inferring HGT. Another method, which examines the patterns of best matches to
different species, is fast and easy to automate but less accurate,
and is therefore not widely used (21).
1.2. Mechanisms of HGT
HGT can occur through three principal mechanisms: natural transformation, transduction
and conjugative transfer. Natural transformation refers to stable uptake of DNA, including
plasmids and chromosomal DNA fragments, under natural growth conditions (22-24). It
most often occurs when bacteria experience environmental changes, such as shifts in temperature
or nutrient supply, or high cell density. To be stable, the uptake needs to be
followed by integration of the foreign DNA into the host genome (25). Newly acquired genes
that are deleterious may prevent the recipient cells from surviving selection, whereas
genes that offer a selective advantage may improve the survival rate and hence expand in
the population.
Transduction is a process in which genetic material is transferred from donor cells to
recipient cells via viral infection. Unlike in transformation, DNA in bacteriophage
transduction is protected during the transfer and no cell-to-cell contact is involved (26, 27).
Depending on the type of DNA that is transferred, transduction is referred to
as generalized or specialized. In the former case, bacterial DNA fragments are randomly
packaged into phage particles. These infective particles can inject bacterial DNA into new
host cells, where the DNA might recombine with the chromosome. In specialized
transduction, temperate phages that insert into the host chromosome can excise imprecisely,
taking with them a piece of the host genome that borders the prophage. This can lead to high
transfer and recombination rates for these pieces of DNA (28).
Conjugative transfer involves direct cell-to-cell contact where the DNA released from the
donor cells is channeled through a cell junction into the recipient cell. Conjugative transfer
is often associated with circular DNA plasmids. As shown in the step-by-step illustration
of conjugative transfer in Figure 2 (29), transfer of plasmids is facilitated by a pore complex
temporarily formed at the tight cell junction. After the transfer process is completed, the
complex collapses until the next transfer occurs. After the plasmids enter the recipient cells,
they may either maintain their independence from their host chromosomes or be integrated
into the host chromosome by recombination. Plasmids that do not become part of the
chromosome may replicate independently using a replication machinery encoded by the
plasmids themselves.
Figure 2. Overview of plasmids and conjugative transfer in the horizontal spread
of genes. In the donor, the events depicted are: a, integration of the plasmid into the
chromosome by recombination between insertion sequence elements; b, movement of
a transposable element through a circular intermediate from the chromosome to the
plasmid; c, initiation of rolling-circle replication at the mating-pair apparatus. In the
recipient cell, the events depicted are: d, recircularization; e, attack by restriction
endonucleases (scissors); f, replication; g, integration into the chromosome by an
illegitimate Campbell recombination; h, recombination between transferred
chromosomal DNA and the resident chromosome. Thomas C and Nielsen K, Nature
Reviews Microbiology Volume 3, September 2005 (Reprinted by permission from
Nature Publishing Group)
1.3. Plasmids as a vector of HGT
1.3.1. Overview of plasmids and their impact on microbial ecology and
evolution
Plasmids are extra-chromosomal, autonomously replicating genetic units that play
important roles in the ecology and evolution of microbes and are present in all three
domains of life (30). The diverse characteristics of plasmids, including size, host range,
genetic composition and function, have a significant impact on bacterial diversity, habitat
association and adaptation, and ultimately microbial evolution (31). As detailed above,
many plasmids can be mobile among host cells through conjugation and are thus a key
element of a gene pool that can be rapidly gained and lost from bacterial host populations.
The mosaic nature of plasmids is widely accepted, in particular the fact that they may
carry genes of varied evolutionary ancestry.
During the evolutionary process, plasmids have developed their own replication, transfer
and maintenance systems to ensure their survival in the host strains and are therefore
considered 'selfish' genetic elements. Plasmids can harbor a wide variety of genes,
including backbone (encoding transfer, maintenance and incompatibility) and accessory
genes (encoding many different functions, including resistance, detoxification and
metabolism) (32). While the backbone genes are essential for their transfer, maintenance
and incompatibility, the accessory genetic components can offer new features to the host
strains such as enhancing host fitness under specific environmental conditions. However,
the functions of a large portion of plasmid genomes remain unknown.
1.3.2. Structure of plasmids
1.3.2.1. Replication system
Many plasmids have developed a complete replication system to ensure successful vertical
transmission from mother to daughter cells. The replication module determines the
copy number of a plasmid and is essential to its survival and to the likelihood of transmission into
new host cells. Plasmids of narrow host range and broad host range have developed
replication systems that use different types of machinery provided by the plasmid in
association with components produced by the host (33). Initially, information about the
mechanisms of narrow-host-range plasmid replication was largely obtained by examining
representative plasmids from the Enterobacteriaceae. It was very recently found
that some narrow-host-range plasmids can also replicate in non-enteric bacterial species
(34). Within this group, two subgroups are distinguished depending on whether a
plasmid-encoded replication protein (Rep protein) is involved. An example of Rep-independent
replication is ColE1, a 6.6 kb E. coli plasmid that requires host-produced
proteins for initiation of replication, such as DNA-dependent RNA polymerase, RNase H,
DNA Pol I, DNA gyrase and topoisomerase, but not Rep proteins. Replication proceeds in a uni- or
bidirectional θ-shaped manner (35) and starts with the binding of these host
proteins at the origin site (ori) in a 0.6 kb region on the plasmid. In contrast, other
enterobacterial plasmids unrelated to ColE1 utilize a replicon containing different structural
components. For example, pSC101 plasmids, obtained from Salmonella panama, contain
the following four components: a gene coding for the Rep protein, clusters of direct repeats
(iterons), binding sites for the DnaA protein and A+T-rich sequences (36). Initiation of
replication involves binding of the RepA protein to the iteron region to begin the formation of
a replisome. The DnaA protein, produced by the host, then recognizes its binding sites on the
plasmid and recruits further proteins to complete the replisome.
In contrast to these narrow-host-range plasmids, broad-host-range plasmids are capable of
replication and maintenance in diverse, unrelated bacteria and have therefore developed
more complex replication systems (37-39). Plasmids in this category can be classified,
based on their incompatibility, into several large groups such as IncC, IncJ, IncP, IncQ and
IncW (40-44). Incompatibility refers to the inability of two or more plasmids with the
same replicon system to coexist in the same cell. It is thought that this incompatibility is
based on competition for replication, and that the inferior competitor is eventually lost from
the host cell. The best-studied examples are in the IncQ group (RSF1010, R1162 and
R300B) (45, 46). These are all multicopy plasmids with nearly identical structure that were
isolated from different bacterial hosts. In this group, replication initiation requires RepA,
RepB and RepC, the last of which recognizes the origin and binds to the iterons of the
larger cis region. Unlike that of pSC101, their replication does not depend on DnaA,
but it requires DNA Pol III and gyrase, similar to ColE1 plasmids. The presence of
plasmid-encoded Rep proteins and the independence of replication from host-produced
DnaA protein are important contributing factors to the broad-host-range
character of these plasmids.
In addition to the traditional classes of replication system described above, an RNA-dependent
system has recently been reported in marine bacteria, represented by the plasmid
pB1067 of Vibrio nigripulchritudo (47). The ori region in this type of plasmid does not
encode a Rep protein to initiate DNA replication; rather, it encodes two RNAs, RNA I and
RNA II, containing complementary sequences that are transcribed from opposite DNA
strands. RNA I is the smaller RNA; it contains about 68 nucleotides, functions as the
negative regulator and is also an essential determinant of plasmid compatibility. The longer
RNA, ranging in size between 250 and 500 nucleotides, is termed RNA II. It remains
inactive when it forms a complex with RNA I through the single-stranded loop regions of the two
RNAs, which inhibits replication. Prior to this discovery, a similar RNA-based replication
mechanism had been observed only for ColE1 and related plasmids from the
Enterobacteriaceae (35). Replication of this type of plasmid involves RNA II; however,
RNA II activity is inhibited by binding of an antisense RNA molecule rather than by the
RNA I found in pB1067. Sequence and structure analysis of these RNAs revealed
that the two types of replicon employed by pB1067 and ColE1 plasmids are not related,
indicating that they may have emerged through independent evolutionary pathways.
In spite of these differences, all plasmids employ a replication system that involves genetic
elements carried on the plasmids themselves as well as proteins produced by the host strains.
The particular combination of these elements gives plasmids
different features, such as a low or high copy number in the host strain, or
a restricted or broad host range. However, they all share one simple goal: to
ensure transmission into the daughter cells.
1.3.2.2. Maintenance system
In addition to replication systems, plasmids employ a variety of mechanisms to ensure
their maintenance in the host strains: partitioning systems, postsegregational host killing
systems, and site-specific resolution systems. These mechanisms can be differentially
associated with host species or plasmid types; however, they all serve the same ultimate
goal, to enhance the survival rate of plasmids (48).
Partitioning systems are typically found in low-copy-number plasmids and are usually small
with relatively simple organization (49). These systems share a central mechanism of
using nucleotide-driven cytomotive filaments to relocate replicated plasmids (48). To
date, most partitioning systems identified consist of a DNA-binding site (par site), an
adaptor protein, and a nucleotide-binding motor protein. The specific function of the
adaptor protein is to recognize the par site. The motor protein, by interacting with the
adaptor protein-DNA complex, helps to distribute plasmids so that each daughter cell
contains at least one plasmid after division. Partitioning systems, depending on the
composition of the motor proteins, can be classified into three categories. Type I systems encode an
NTPase called ParA and a centromere-binding protein (CBP) called ParB, which function
together in a "pulling" manner, as illustrated in Figure 3 (48). Type II, the best understood
type, encodes ParM and ParR, which function in a "pushing" style through an insertional
polymerization mechanism. Type III, which has recently been characterized from Bacillus
thuringiensis pBtoxis, encodes TubZ and TubR, which function together in a "tramming"
style (Figure 3) (48), a mechanism that is distinct from both Type I and II. In brief, a TubR
multimer first recognizes the C-termini of TubZ-GTP filaments, and the captured TubR is
then transported to the cell pole. Once this complex reaches the cell membrane, the TubZ
filament undergoes a conformational change to release TubR and can be recycled for the
next round of transport.
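A short calculation helps to show why active partitioning matters most at low copy number. The sketch below assumes, purely for illustration, that in the absence of a partitioning system each plasmid copy present at division goes to either daughter cell independently and with equal probability; under that assumption the chance that one of the two daughters receives no plasmid is 2 x (1/2)^n for n copies. The function name is hypothetical and this random-segregation model is not taken from the sources cited above.

```python
def plasmid_free_probability(copy_number):
    """Probability that one of the two daughter cells inherits no plasmid
    when copy_number copies segregate independently and at random."""
    if copy_number < 1:
        raise ValueError("copy_number must be at least 1")
    # Each daughter is missed by all n copies with probability (1/2)**n;
    # the two events cannot both happen, so the probabilities add.
    return 2 * 0.5 ** copy_number


# Loss per division falls off steeply as the copy number rises.
for n in (1, 2, 5, 10, 20):
    print(n, plasmid_free_probability(n))
```

With one or two copies per cell, a plasmid-free daughter would arise in most divisions under random segregation, whereas at twenty copies the event becomes very rare, which is consistent with partitioning systems being found mainly on low-copy-number plasmids.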
Postsegregational killing systems, also named toxin-antitoxin (TA) systems, ensure
plasmid stability by killing plasmid-free daughter cells (50). So far, the toxins characterized in
all classified TA systems are proteins, whereas antitoxins can be either proteins or small
RNAs. Under normal circumstances, the antitoxin is expressed at a much higher level
than the toxin so that toxin action is inhibited, enhancing the survival rate of the host cells.
Figure 3. Schematic models for type I, II and III partitioning system. (a) Type I
partition utilizes the host cell nucleoid as a 'track' for NTPase-ATP binding and
polymerization (square). When the NTPase-ATP polymer encounters a ParB-centromere
partition complex (shown as a circle), that is, the ParB-attached plasmid,
the NTPase activity is activated, resulting in dissociation of capping ParA-ADP subunits
(triangles) and polymer retraction. The ParB-plasmid is either pulled along in the
retreating ParA polymer or is attracted and diffuses toward the moving polymer. The
ultimate outcome is the dynamic equi-distribution of ParB-plasmids at opposite ends
of the nucleoid. (b) Type II partition uses a pushing or insertional polymerization mode
of segregation. In this model, the dynamically unstable ParM filaments are stabilized
and propagate only when each end is captured by a ParR-centromere partition complex.
The polymer continues to grow upon addition of ParM-ATP or ParM-GTP subunits to
the ParR-ParM interface. The outcome is redistribution of replicated plasmids to
opposite poles. (c) Type III partition employs a tram mechanism of partition. TubR
binds the centromere, serving as a high local concentration of binding sites for the
C-terminal flexible domains emanating from treadmilling TubZ filaments. Once captured,
the TubR-plasmid is transported to the cell pole by the treadmilling TubZ filaments.
Upon reaching the membrane the TubZ filament bends, likely dumping its TubR-plasmid
cargo, and reverses direction. Now traveling in the opposite direction, the
TubZ filament binds another TubR-plasmid cargo and carries it to the opposite pole.
Schumacher M, Current Opinion in Structural Biology, Volume 39, 2012 (Reprinted
by permission from Elsevier Limited).
If the plasmid is lost during cell division, the plasmid-free daughter cells will no longer
be protected from toxin action and will be killed by the activated toxin, which interferes
with key intracellular processes such as translation, cytoskeleton synthesis, cell
membrane and cell wall biosynthesis and replication (51). Depending on the molecular
nature of the antitoxin as well as the type of interaction with the toxin, TA modules are
classified into five different types (Type I to V) (52). The antitoxins of types I and III are both
small non-coding RNAs but differ in their mode of interaction with the toxin. In type I, the
antitoxin down-regulates toxin production by base pairing with the stable toxin
mRNA (53). As a consequence, the toxin mRNA cannot bind to the ribosome, which prevents
further translation of the toxin. In contrast, type III antitoxins
suppress the toxin not by inhibiting translation but by binding directly
to the toxin protein (54). Antitoxins in the other three classes are small proteins that act
by forming a protein-protein complex with the toxin (type II), by counteracting the toxin's
interference with cytoskeleton assembly (type IV) (55) or by preventing translation of the toxin (type V) (56).
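The population-level effect of postsegregational killing can be illustrated with a very small deterministic model. The sketch assumes that plasmid-bearing and plasmid-free cells grow equally fast, that a fixed fraction of plasmid-bearing cells loses the plasmid each generation, and that a fixed fraction of these new segregants is killed by the toxin. The function and parameter names, and the numerical values, are hypothetical and are not estimates from the literature cited above.

```python
def plasmid_bearing_fraction(loss_rate, kill_efficiency, generations):
    """Fraction of plasmid-bearing cells after a number of generations.

    loss_rate:        fraction of plasmid-bearing cells that lose the plasmid
                      per generation (0 <= loss_rate < 1)
    kill_efficiency:  fraction of new plasmid-free segregants killed by the
                      toxin (0 = no TA system, 1 = perfect killing)
    """
    if not 0 <= loss_rate < 1 or not 0 <= kill_efficiency <= 1:
        raise ValueError("rates must lie in [0, 1) and [0, 1]")
    bearing, free = 1.0, 0.0
    for _ in range(generations):
        segregants = bearing * loss_rate              # newly plasmid-free cells
        bearing -= segregants
        free += segregants * (1.0 - kill_efficiency)  # segregants that survive
        total = bearing + free                        # renormalize to fractions
        bearing, free = bearing / total, free / total
    return bearing


# Hypothetical comparison over 100 generations at 1% loss per generation.
for kill in (0.0, 0.5, 1.0):
    print(kill, plasmid_bearing_fraction(0.01, kill, 100))
```

Without killing, the plasmid-bearing fraction decays as (1 - loss_rate) raised to the number of generations; with perfect killing it stays at one, because every segregant is removed before it can found a plasmid-free lineage. The model ignores any growth cost of the plasmid and any reacquisition by transfer.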
In addition to the above systems, plasmids frequently utilize DNA site-specific
resolution systems to be maintained in the host strains (49). Plasmids can be present
in cells in more than one copy, which increases their chances of being replicated.
However, this beneficial trait can also cause instability because of the high rate of
formation of dimers and even multimers through recombination between copies (49).
These multimers can be very unstable and may eventually be lost from the host cells. To
prevent loss due to multimerization, plasmids make use of a host cell-encoded enzyme
complex that converts multimers into monomers by recognizing a cer site typically found
in plasmids with high copy number (57). Resolution of multimeric forms of plasmids is
achieved by site-specific recombination at the duplicated replicon sites,
a reaction catalyzed by a recombinase. Most plasmids encode their own
recombinase that is specific for the target recombination sites, with the exception of a few
that utilize a host-encoded recombination system.
1.3.2.3. Conjugation system
In addition to the replication and maintenance systems, some plasmids, also called mobile
conjugative elements (MCE), utilize conjugative systems to allow horizontal transmission
(58). Typically, the full conjugative machinery of a conjugative plasmid comprises four
components: an origin of transfer (oriT), a relaxase, a type IV secretion system (T4SS) and a type
IV coupling protein (T4CP) (59). Plasmids equipped with the full set of components are
identified as self-transmissible or conjugative plasmids, whereas plasmids equipped with
a minimal set including only the origin of transfer, a relaxase and one or more nicking-accessory
proteins are often referred to as mobilizable plasmids (60). Here we focus mainly
on the structural organization of a conjugative plasmid. The process of conjugation is
initiated when the relaxase recognizes the oriT and
cleaves it, producing the DNA strand to be transferred. The transport of the
plasmid into the recipient cell is facilitated by the T4SS, a membrane-associated complex of
12 to 30 proteins (61). The protein complex forms a mating channel through which
single-stranded DNA passes. The DNA is released into the recipient cell by the T4CP, a
protein complex that is attached to the inner cell membrane and that interacts with both
the T4SS and the secretion substrate (62). It has been postulated that the T4CP may function as a
DNA pump during conjugative transfer (63).
Mobilizable plasmids, which are not self-transmissible because they lack the functions
required for mating pair formation, usually carry genetic elements encoding relaxosome
components and the origin of transfer (oriT), a short DNA sequence required in cis for a
plasmid to be conjugatively transmissible (Figure 4). Initiation of DNA transfer by the
relaxase follows a mechanism similar to that of conjugative plasmids. The subsequent
transfer process relies on conjugative components expressed by co-existing
self-transmissible (conjugative) plasmids in the same host strain (60).
[Figure 4: panels A and B; labels: mobilizable, conjugative, MOB, MPF]
Figure 4. Schematic view of the genetic constitution of transmissible plasmids (A)
and some essential interactions in the process of conjugation (B). (A) Self-transmissible
or conjugative plasmids code for the four components of a conjugative
apparatus: an origin of transfer (oriT) (violet), a relaxase (R) (red), a type IV coupling
protein (T4CP) (green), and a type IV secretion system (T4SS) (blue). The T4SS is, in
fact, a complex of 12 to 30 proteins, depending on the system (see text). Mobilizable
plasmids contain just a MOB module (with or without the T4CP) and need the MPF of
a coresident conjugative plasmid to become transmissible by conjugation. (B) The
relaxase cleaves a specific site within oriT, and this step starts conjugation. The DNA
strand that contains the relaxase protein covalently bound to its 5' end is displaced by
an ongoing conjugative DNA replication process. The relaxase interacts with the T4CP
and then with other components of the T4SS. As a result, it is transported to the recipient
cell, with the DNA threaded to it. Subsequently, the DNA is pumped into the recipient
by the ATPase activity of the T4CP (Smillie C, et al. Microbiology and Molecular
Biology Reviews, Volume 74, 2010. Reprinted by permission from ASM Press).
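The distinction between conjugative and mobilizable plasmids described above can be written as a simple decision rule. The following is a minimal sketch, not a tool used in this study: the record type `PlasmidModules`, the function `classify_mobility` and the label "non-transmissible" are hypothetical names introduced here for illustration, and the rule ignores nicking-accessory proteins for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlasmidModules:
    """Presence/absence of conjugation modules (hypothetical record type)."""
    has_orit: bool      # origin of transfer (oriT)
    has_relaxase: bool  # relaxase
    has_t4cp: bool      # type IV coupling protein
    has_t4ss: bool      # type IV secretion system (mating pair formation)


def classify_mobility(p: PlasmidModules) -> str:
    """Classify a plasmid following the rules described in the text.

    Assumption: a plasmid lacking oriT or a relaxase is labeled
    "non-transmissible"; that label is illustrative only.
    """
    has_mob = p.has_orit and p.has_relaxase
    if has_mob and p.has_t4cp and p.has_t4ss:
        return "conjugative"   # full set of four components
    if has_mob:
        return "mobilizable"   # MOB module, with or without the T4CP
    return "non-transmissible"


if __name__ == "__main__":
    print(classify_mobility(PlasmidModules(True, True, True, True)))
    print(classify_mobility(PlasmidModules(True, True, False, False)))
```

A mobilizable plasmid classified this way would still depend on the T4SS of a coresident conjugative plasmid for transfer, which the rule does not model.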
1.4. Phage with plasmid structure
Bacteriophages are tremendously abundant on Earth, with an estimated 10^31 phage particles
found in the biosphere (64, 65). Phage infection occurs frequently, with up to 10^23
infections every second, suggesting that bacteriophages are a highly dynamic biological
force (66). To date, phage sequence information has been obtained by different approaches,
including analysis of laboratory-isolated phages, viral metagenomics and prophage mining
(67). These methods complement each other and have enriched our knowledge of phages.
Phages carry genetic material in different forms: RNA (68), single-stranded (ss) DNA (69)
and double-stranded (ds) DNA (70); dsDNA phages appear to comprise the vast majority of
bacteriophages. The genome sizes of these dsDNA phages range from 3 kbp to 500 kbp (67),
and the size of the capsid into which the genome is packaged varies accordingly. Because
the genome size of a phage, or the amount of DNA packaged into the capsid, can directly
determine virion infectivity, loss or acquisition of DNA can immediately influence
bacteriophage evolution (67). Comparative analysis reveals that different regions of phage
genomes display distinct evolutionary histories, suggesting that phage genomes are highly
mosaic (71). One possible explanation is that HGT and recombination play an important
role.
Some phages demonstrate features very similar to plasmids, which are known vehicles of
HGT. For example, the lambdoid phage N15 of E. coli, unlike many typical temperate phages,
does not integrate into the host chromosome. Instead, it persists as extrachromosomal,
self-replicating DNA with covalently closed ends, a structural organization typically
employed by plasmids (72). In another case, bacteriophage P1 is maintained as a
low-copy-number plasmid prophage in host strains (73). P1 expresses the recombinase Cre,
which assists with multiple plasmid maintenance functions such as resolving plasmid
multimers and maintaining low copy numbers. Perhaps some phages have evolved to carry
their own replication system while maintaining their infection system. The mosaic nature
of these phages, which is similar to that observed in plasmids, may reflect frequent HGT
that occurs not only in bacterial populations but also in phages.
1.5. Vibrio as a model system
Vibrio (Vibrionaceae) are gram-negative gamma-Proteobacteria that have long been used
as models for studying heterotrophic processes in the ocean (74, 75). Vibrio are motile,
metabolically and ecologically versatile members of coastal plankton. Cultivation-dependent
and -independent studies have consistently detected Vibrio at high densities
in and/or on marine macroorganisms, including fish, corals, mollusks, sea grass, shrimp
and zooplankton (75, 76). They also occur free-living in the water column as well as
associated with various types of organic particles and organisms (77). While some, such as
V. cholerae, V. parahaemolyticus and V. vulnificus, are well-known human pathogens, most
Vibrio are non-pathogenic or occasional pathogens of marine organisms.
Several properties of Vibrio make them a good model for studying plasmid diversity:
culturability, taxonomic breadth and ecological diversity. Among all environmental
bacteria, Vibrio are one of the few that can be easily cultured in the laboratory, and isolation
tools established so far have proven to be efficient.
In this study, we performed our analyses on two Vibrio collections previously obtained by
our laboratory from two separate geographical locations in both spring and fall. From the
surface water of Oyster Pond and its lagoon in Woods Hole, MA, only Vibrio cholerae strains
were collected, whereas a much higher diversity of Vibrio strains was isolated from the
surface water of Plum Island Estuary in Ipswich, MA. Isolation, classification and further
characterization of the cells have been described in detail previously (78). In brief,
cells isolated from both collections were separated into four size fractions, each
containing microorganisms and organic material and/or larger organisms of different
origins. Particles are considered to be enriched in zooplankton if larger than 63 µm,
enriched in organic particles if between 5 and 63 µm, and enriched in larger cells or cells
attached to very small particulate matter if between 1 and 5 µm. Free-living cells were
obtained from the 0.22 to 1 µm size range. In collaboration with the Alm lab, our lab
developed the AdaptML model, which predicted six habitats from the samples collected, based
on the season and size fraction information of each strain in relation to their individual
distribution. Through further analyses of the six habitats in relation to their
phylogenetic relationship, all strains were grouped into 25 populations.
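The four size fractions described above amount to a lookup on particle size. The sketch below is illustrative only: the function name `size_fraction`, the returned labels, and the handling of values that fall exactly on a cutoff (assigned to the larger fraction) are assumptions, since the text gives only the ranges. It is not part of AdaptML.

```python
def size_fraction(size_um: float) -> str:
    """Map a particle size in micrometers to one of the four size fractions.

    Cutoffs (63, 5, 1 and 0.22 um) are taken from the text; boundary
    handling and label wording are assumptions made for this sketch.
    """
    if size_um > 63:
        return "zooplankton-enriched"
    if size_um >= 5:
        return "organic particles"
    if size_um >= 1:
        return "large cells or small-particle-attached"
    if size_um >= 0.22:
        return "free-living"
    raise ValueError("below the 0.22 um filter cutoff; not retained")


if __name__ == "__main__":
    for size in (100, 20, 3, 0.5):
        print(size, "->", size_fraction(size))
```

In the study itself, strains are assigned to fractions by the filter on which they were recovered, so a function like this only restates the cutoffs and does not replace the filtration protocol.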
1.6. ECEs in Vibrio
Studies on the diversity and distribution of ECEs in Vibrio have been primarily focused on
pathogenic strains because of evidence that ECEs contribute to virulence. In addition,
ECEs are also a major player in the spread of antibiotic resistance among Vibrio,
particularly among Vibrio cholerae strains. Nonetheless, some recent studies have shown
that ECEs can be identified in frequently studied Vibrio species including environmental
strains (76).
To date, the complete sequences of 37 plasmids identified from Vibrio have been deposited
in GenBank (Table 1). Some strains carry only one plasmid while others carry more than
one type. The size of these plasmids varies considerably, ranging from 2 to 250 kbp. Studies
that focus on pathogenic Vibrio strains have shown that they may require plasmids to cause
diseases in fish and various invertebrates. For example, the 65 kbp plasmid pJM1 detected
in V. anguillarum was shown to cause fatal hemorrhagic septicemic disease in salmon and
other fish (79). Complete genomic sequence analysis revealed that the genetic components
of pJM1 encode proteins involved in the production of a siderophore, a key element in V.
anguillarum pathogenicity. Other plasmids have also been linked to Vibrio virulence, such
as the 11.2 kbp plasmid pSFn1 in V. nigripulchritudo (80). Vibrio strains possessing this
plasmid can cause disease in shrimp. Similarly, the coral pathogen V. shilonii was found
to harbor a 13.4 kbp plasmid, pAK1, whose genetic composition is similar to that of
pSFn1 (81). Overall, this suggests that plasmids are an important virulence factor in Vibrio
strains that can cause diseases in a variety of ocean life.

Table 1. Summary of characterized Vibrio plasmids. All data were obtained from
GenBank as of June 25, 2014.

Plasmid name   Host                               Accession number  Size (bp)  Topology
pVAE259        Vibrio alginolyticus               NC_013178         6075       circular
pJv            Vibrio anguillarum                 NC_019325         5982       circular
pJM1           Vibrio anguillarum 775             NC_005250         65009      circular
unnamed        Vibrio campbellii ATCC BAA-1116    NC_022271         89003      circular
pVIBHAR        Vibrio campbellii ATCC BAA-1116    NC_009777         89008      circular
pVCG4.1        Vibrio cholerae                    NC_010910         2163       circular
pVCG1.2        Vibrio cholerae                    NC_010899         2357       circular
pVCG1.1        Vibrio cholerae                    NC_010897         4439       circular
pTLC           Vibrio cholerae                    NC_004982         4719       circular
pSIO1          Vibrio cholerae                    NC_006860         4906       circular
pVCR94deltaX   Vibrio cholerae                    NC_023291         120572     circular
pVC            Vibrio cincinnatiensis             NC_019241         6309       circular
unnamed        Vibrio coralliilyticus             NC_020451         26631      circular
pES100         Vibrio fischeri ES114              NC_006842         45849      circular
pMJ100         Vibrio fischeri MJ11               NC_011185         179459     circular
pBD146         Vibrio fluvialis                   NC_011797         7472       circular
pVCR1          Vibrio harveyi                     NC_021808         9615       circular
pVCR1          Vibrio harveyi                     NC_023279         9615       circular
pSFn1          Vibrio nigripulchritudo            NC_010733         11237      linear
VIBNI pA       Vibrio nigripulchritudo            NC_015156         247271     circular
pZY5           Vibrio parahaemolyticus            NC_012859         3504       circular
pSA19          Vibrio parahaemolyticus            NC_002088         4839       circular
unnamed        Vibrio parahaemolyticus            NC_021292         7138       circular
pO3K6          Vibrio parahaemolyticus            NC_002473         8784       circular
pAK1           Vibrio shilonii                    NC_010734         13415      linear
p09022A        Vibrio sp. 09022                   NC_010114         31036      circular
p0908          Vibrio sp. 0908                    NC_010113         81413      circular
p23023         Vibrio sp. 23023                   NC_010112         52527      circular
pPS41          Vibrio sp. 41                      NC_004961         6886       circular
pTC68          Vibrio sp. TC68                    NC_008690         7847       circular
pVT1           Vibrio tapetis                     NC_010614         82266      circular
pMP1           Vibrio vulnificus                  NC_012758         7628       circular
pC4602-1       Vibrio vulnificus                  NC_009702         56628      circular
pC4602-2       Vibrio vulnificus                  NC_009703         66946      circular
pR99           Vibrio vulnificus                  NC_009701         68446      circular
unnamed        Vibrio vulnificus VVyb1(BT3)       NZ_CM001801       39190      circular
pYJ016         Vibrio vulnificus YJ016            NC_005128         48508      circular
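The stated count (37 plasmids) and size range (2 to 250 kbp) can be checked against the sizes listed in Table 1. The snippet below is a small sketch for that check; the list `SIZES_BP` is transcribed from the Size (bp) column of Table 1, and the function name `size_summary` is a hypothetical helper introduced here.

```python
# Plasmid sizes in bp, transcribed from the "Size (bp)" column of Table 1.
SIZES_BP = [
    6075, 5982, 65009, 89003, 89008, 2163, 2357, 4439, 4719, 4906,
    120572, 6309, 26631, 45849, 179459, 7472, 9615, 9615, 11237, 247271,
    3504, 4839, 7138, 8784, 13415, 31036, 81413, 52527, 6886, 7847,
    82266, 7628, 56628, 66946, 68446, 39190, 48508,
]


def size_summary(sizes_bp):
    """Return the number of plasmids and the smallest and largest size in bp."""
    return {
        "count": len(sizes_bp),
        "min_bp": min(sizes_bp),
        "max_bp": max(sizes_bp),
    }


if __name__ == "__main__":
    summary = size_summary(SIZES_BP)
    print(summary["count"], "plasmids")
    print("range: %.1f to %.1f kbp" % (summary["min_bp"] / 1000, summary["max_bp"] / 1000))
```

The smallest and largest entries (2,163 bp and 247,271 bp) are consistent with the rounded range of 2 to 250 kbp given in the text.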
While these studies focused on the identification of plasmids in pathogenic Vibrio strains,
only a few reports have addressed the diversity of plasmids in environmental Vibrio strains.
For example, the presence of plasmids has been assessed in several environmental Vibrio
strains, including three coastal strains of V. fluvialis, V. mediterranei and V. campbellii
(82). Plasmids of different sizes, ranging from 31 to 81 kbp, were detected in all three
strains, and all plasmids shared similar G+C contents. Sequence analysis revealed that the
three plasmids encode different proteins, suggesting that the genetic organization and
transfer mechanisms of plasmids are diverse even in a small group of Vibrio strains (82).
In another study, several small plasmids were characterized from V. parahaemolyticus;
however, these plasmids only encode hypothetical proteins and their roles in the host are
still unknown. Although these studies suggest that plasmids are fairly frequent in Vibrio
and can transfer potentially beneficial functions to the host, there is little
comprehensive information on the diversity, evolutionary dynamics, abundance and
environmental association of plasmids.
1.7. Goals of this thesis
The overall goal of this study is to provide a deeper understanding of plasmid (and other
extrachromosomal element) diversity and their distribution among Vibrio populations in
the environment. We seek to address the following questions: (1) What is the diversity of
the ECE backbone system encoding basic replication, maintenance, incompatibility and
transfer processes? (2) What is the relationship between plasmids and their host and the
potential association of their occurrence with the environment? (3) What is the distribution
pattern and eco-evolutionary dynamics of plasmids among Vibrio populations?
References
1. Woese CR, Kandler O, Wheelis ML. 1990. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci U S A 87:4576-4579.
2. Woese CR. 1987. Bacterial evolution. Microbiological reviews 51:221-271.
3. Gogarten JP, Townsend JP. 2005. Horizontal gene transfer, genome innovation and evolution. Nat Rev Microbiol 3:679-687.
4. Smith MW, Feng DF, Doolittle RF. 1992. Evolution by acquisition: the case for horizontal gene transfers. Trends in biochemical sciences 17:489-493.
5. Griffith F. 1928. The Significance of Pneumococcal Types. The Journal of hygiene 27:113-159.
6. Nakamura Y, Itoh T, Matsuda H, Gojobori T. 2004. Biased biological functions of horizontally transferred genes in prokaryotic genomes. Nature genetics 36:760-766.
7. Vocke C, Bastia D. 1983. Primary structure of the essential replicon of the plasmid pSC101. Proc Natl Acad Sci U S A 80:6557-6561.
8. Garcia-Vallve S, Romeu A, Palau J. 2000. Horizontal gene transfer in bacterial and archaeal complete genomes. Genome Res 10:1719-1725.
9. Ragan MA. 2001. Detection of lateral gene transfer among microbial genomes. Curr Opin Genet Dev 11:620-626.
10. Kunnimalaiyaan M, Stevenson DM, Zhou Y, Vary PS. 2001. Analysis of the replicon region and identification of an rRNA operon on pBM400 of Bacillus megaterium QM B1551. Mol Microbiol 39:1010-1021.
11. Eisen JA. 2000. Horizontal gene transfer among microbial genomes: new insights from complete genome analysis. Curr Opin Genet Dev 10:606-611.
12. Koonin EV, Makarova KS, Aravind L. 2001. Horizontal gene transfer in prokaryotes: quantification and classification. Annual review of microbiology 55:709-742.
13. Ragan MA. 2001. On surrogate methods for detecting lateral gene transfer. FEMS microbiology letters 201:187-191.
14. Martinez RJ, Wang Y, Raimondo MA, Coombs JM, Barkay T, Sobecky PA. 2006. Horizontal gene transfer of PIB-type ATPases among bacteria isolated from radionuclide- and metal-contaminated subsurface soils. Appl Environ Microbiol 72:3111-3118.
15. Lawrence JG, Ochman H. 1998. Molecular archaeology of the Escherichia coli genome. Proc Natl Acad Sci U S A 95:9413-9417.
16. Gogarten JP, Doolittle WF, Lawrence JG. 2002. Prokaryotic evolution in light of gene transfer. Molecular biology and evolution 19:2226-2238.
17. Clarke GD, Beiko RG, Ragan MA, Charlebois RL. 2002. Inferring genome trees by using a filter to eliminate phylogenetically discordant sequences and a distance matrix based on mean normalized BLASTP scores. J Bacteriol 184:2072-2080.
18. Medigue C, Rouxel T, Vigier P, Henaut A, Danchin A. 1991. Evidence for horizontal gene transfer in Escherichia coli speciation. J Mol Biol 222:851-856.
19. Lawrence JG, Ochman H. 1997. Amelioration of bacterial genomes: rates of change and exchange. J Mol Evol 44:383-397.
20. Daubin V, Lerat E, Perriere G. 2003. The source of laterally transferred genes in bacterial genomes. Genome Biol 4:R57.
21. Eisen JA. 1995. The RecA protein as a model molecule for molecular systematic studies of bacteria: comparison of trees of RecAs and 16S rRNAs from the same species. J Mol Evol 41:1105-1123.
22. Chen I, Dubnau D. 2004. DNA uptake during bacterial transformation. Nat Rev Microbiol 2:241-249.
23. Dubnau D. 1999. DNA uptake in bacteria. Annual review of microbiology 53:217-244.
24. Dubeikovskii AN, Boronin AM. 1990. [Localization in Escherichia coli of transcribed regions of the broad host range plasmid pBS222]. Molekuliarnaia genetika, mikrobiologiia i virusologiia:27-29.
25. de Vries J, Wackernagel W. 2002. Integration of foreign DNA during natural transformation of Acinetobacter sp. by homology-facilitated illegitimate recombination. Proc Natl Acad Sci U S A 99:2094-2099.
26. Wommack KE, Colwell RR. 2000. Virioplankton: viruses in aquatic ecosystems. Microbiology and molecular biology reviews: MMBR 64:69-114.
27. Ashelford KE, Day MJ, Fry JC. 2003. Elevated abundance of bacteriophage infecting bacteria in soil. Appl Environ Microbiol 69:285-289.
28. Heuer H, Smalla K. 2007. Horizontal gene transfer between bacteria. Environmental biosafety research 6:3-13.
29. Thomas CM, Nielsen KM. 2005. Mechanisms of, and barriers to, horizontal gene transfer between bacteria. Nat Rev Microbiol 3:711-721.
30. van Elsas JD, Bailey MJ. 2002. The ecology of transfer of mobile genetic elements. FEMS microbiology ecology 42:187-197.
31. Johnson TJ, Nolan LK. 2009. Pathogenomics of the virulence plasmids of Escherichia coli. Microbiology and molecular biology reviews: MMBR 73:750-774.
32. Boltner D, MacMahon C, Pembroke JT, Strike P, Osborn AM. 2002. R391: a conjugative integrating mosaic comprised of phage, plasmid, and transposon elements. J Bacteriol 184:5158-5169.
33. Chattoraj DK, Snyder KM, Abeles AL. 1985. P1 plasmid replication: multiple functions of RepA protein at the origin. Proc Natl Acad Sci U S A 82:2588-2592.
34. Rakowski SA, Filutowicz M. 2013. Plasmid R6K replication control. Plasmid 69:231-242.
35. del Solar G, Giraldo R, Ruiz-Echevarria MJ, Espinosa M, Diaz-Orejas R. 1998. Replication and control of circular bacterial plasmids. Microbiology and molecular biology reviews: MMBR 62:434-464.
36. Ingmer H, Miller C, Cohen SN. 2001. The RepA protein of plasmid pSC101 controls Escherichia coli cell division through the SOS response. Mol Microbiol 42:519-526.
37. Kolatka K, Kubik S, Rajewska M, Konieczny I. 2010. Replication and partitioning of the broad-host-range plasmid RK2. Plasmid 64:119-134.
38. Leao SC, Matsumoto CK, Carneiro A, Ramos RT, Nogueira CL, Junior JD, Lima KV, Lopes ML, Schneider H, Azevedo VA, da Costa da Silva A. 2013. Correction: The Detection and Sequencing of a Broad-Host-Range Conjugative IncP-1beta Plasmid in an Epidemic Strain of subsp. PLoS One 8.
39. Leao SC, Matsumoto CK, Carneiro A, Ramos RT, Nogueira CL, Lima JD, Jr., Lima KV, Lopes ML, Schneider H, Azevedo VA, da Costa da Silva A. 2013. The detection and sequencing of a broad-host-range conjugative IncP-1beta plasmid in an epidemic strain of Mycobacterium abscessus subsp. bolletii. PLoS One 8:e60746.
40. Uga H, Matsunaga F, Wada C. 1999. Regulation of DNA replication by iterons: an interaction between the ori2 and incC regions mediated by RepE-bound iterons inhibits DNA replication of mini-F plasmid in Escherichia coli. The EMBO journal 18:3856-3867.
41. van Zyl U, Deane SM, Rawlings DE. 2003. Analysis of the mobilization region of the broad-host-range IncQ-like plasmid pTC-F14 and its ability to interact with a related plasmid, pTF-FC2. J Bacteriol 185:6104-6111.
42. Pembroke JT, Murphy DB. 2000. Isolation and analysis of a circular form of the IncJ conjugative transposon-like elements, R391 and R997: implications for IncJ incompatibility. FEMS microbiology letters 187:133-138.
43. Jacquet MA, Ehrlich R. 1985. In vivo and in vitro effect of mutations in tetA promoter from pSC101: insertion of poly(dA.dT) stretch in the spacer region does not inactivate the promoter. Biochimie 67:987-997.
44. Fernandez-Lopez R, Garcillan-Barcia MP, Revilla C, Lazaro M, Vielva L, de la Cruz F. 2006. Dynamics of the IncW genetic backbone imply general trends in conjugative plasmid evolution. FEMS microbiology reviews 30:942-966.
45. Sakai H, Komano T. 1996. DNA replication of IncQ broad-host-range plasmids in gram-negative bacteria. Bioscience, biotechnology, and biochemistry 60:377-382.
46. Loftie-Eaton W, Rawlings DE. 2012. Diversity, biology and evolution of IncQ-family plasmids. Plasmid 67:15-34.
47. Le Roux F, Davis BM, Waldor MK. 2011. Conserved small RNAs govern replication and incompatibility of a diverse new plasmid family from marine bacteria. Nucleic Acids Res 39:1004-1013.
48. Schumacher MA. 2012. Bacterial plasmid partition machinery: a minimalist approach to survival. Curr Opin Struct Biol 22:72-79.
49. Hsu CC, Chen CW. 2010. Linear plasmid SLP2 is maintained by partitioning, intrahyphal spread, and conjugal transfer in Streptomyces. J Bacteriol 192:307-315.
50. Ogura T, Hiraga S. 1983. Mini-F plasmid genes that couple host cell division to plasmid proliferation. Proc Natl Acad Sci U S A 80:4784-4788.
51. Unterholzner SJ, Poppenberger B, Rozhon W. 2013. Toxin-antitoxin systems: Biology, identification, and application. Mobile genetic elements 3:e26219.
52. Guglielmini J, Szpirer C, Milinkovitch MC. 2008. Automated discovery and phylogenetic analysis of new toxin-antitoxin systems. BMC Microbiol 8:104.
53. Brantl S. 2012. Bacterial type I toxin-antitoxin systems. RNA biology 9:1488-1490.
54. Blower TR, Short FL, Rao F, Mizuguchi K, Pei XY, Fineran PC, Luisi BF, Salmond GP. 2012. Identification and classification of bacterial Type III toxin-antitoxin systems encoded in chromosomal and plasmid genomes. Nucleic Acids Res 40:6158-6173.
55. Masuda H, Tan Q, Awano N, Wu KP, Inouye M. 2012. YeeU enhances the bundling of cytoskeletal polymers of MreB and FtsZ, antagonizing the CbtA (YeeV) toxicity in Escherichia coli. Mol Microbiol 84:979-989.
56. Wang X, Lord DM, Cheng HY, Osbourne DO, Hong SH, Sanchez-Torres V, Quiroga C, Zheng K, Herrmann T, Peti W, Benedik MJ, Page R, Wood TK. 2012. A new type V toxin-antitoxin system where mRNA for toxin GhoT is cleaved by antitoxin GhoS. Nature chemical biology 8:855-861.
57. Tolmasky ME, Colloms S, Blakely G, Sherratt DJ. 2000. Stability by multimer resolution of pJHCMW1 is due to the Tn1331 resolvase and not to the Escherichia coli Xer system. Microbiology 146 (Pt 3):581-589.
58. Davison J. 1999. Genetic exchange between bacteria in the environment. Plasmid 42:73-91.
59. Garcillan-Barcia MP, Francia MV, de la Cruz F. 2009. The diversity of conjugative relaxases and its application in plasmid classification. FEMS microbiology reviews 33:657-687.
60. Smillie C, Garcillan-Barcia MP, Francia MV, Rocha EP, de la Cruz F. 2010. Mobility of plasmids. Microbiology and molecular biology reviews: MMBR 74:434-452.
61. Alvarez-Martinez CE, Christie PJ. 2009. Biological diversity of prokaryotic type IV secretion systems. Microbiology and molecular biology reviews: MMBR 73:775-808.
62. Llosa M, Zunzunegui S, de la Cruz F. 2003. Conjugative coupling proteins interact with cognate and heterologous VirB10-like proteins while exhibiting specificity for cognate relaxosomes. Proc Natl Acad Sci U S A 100:10465-10470.
63. Tato I, Matilla I, Arechaga I, Zunzunegui S, de la Cruz F, Cabezon E. 2007. The ATPase activity of the DNA transporter TrwB is modulated by protein TrwA: implications for a common assembly mechanism of DNA translocating motors. The Journal of biological chemistry 282:25569-25576.
64. Hatfull GF. 2008. Bacteriophage genomics. Curr Opin Microbiol 11:447-453.
65. Hendrix RW, Hatfull GF, Smith MC. 2003. Bacteriophages with tails: chasing their origins and evolution. Research in microbiology 154:253-257.
66. Suttle CA. 2007. Marine viruses--major players in the global ecosystem. Nat Rev Microbiol 5:801-812.
67. Hatfull GF, Hendrix RW. 2011. Bacteriophages and their genomes. Current opinion in virology 1:298-303.
68. Friedman SD, Genthner FJ, Gentry J, Sobsey MD, Vinje J. 2009. Gene mapping and phylogenetic analysis of the complete genome from 30 single-stranded RNA male-specific coliphages (family Leviviridae). J Virol 83:11233-11243.
69. Werten S. 2013. Identification of the ssDNA-binding protein of bacteriophage T5: Implications for T5 replication. Bacteriophage 3:e27304.
70. Ackermann HW. 2007. 5500 Phages examined in the electron microscope. Archives of virology 152:227-243.
71. Hendrix RW, Smith MC, Burns RN, Ford ME, Hatfull GF. 1999. Evolutionary relationships among diverse bacteriophages and prophages: all the world's a phage. Proc Natl Acad Sci U S A 96:2192-2197.
72. Ravin NV. 2011. N15: the linear phage-plasmid. Plasmid 65:102-109.
73. Lobocka MB, Rose DJ, Plunkett G, 3rd, Rusin M, Samojedny A, Lehnherr H, Yarmolinsky MB, Blattner FR. 2004. Genome of bacteriophage P1. J Bacteriol 186:7032-7068.
74. Reen FJ, Almagro-Moreno S, Ussery D, Boyd EF. 2006. The genomic code: inferring Vibrionaceae niche specialization. Nat Rev Microbiol 4:697-704.
75. Takemura AF, Chien DM, Polz MF. 2014. Associations and dynamics of Vibrionaceae in the environment, from the genus to the population level. Frontiers in microbiology 5:38.
76. Hazen TH, Pan L, Gu JD, Sobecky PA. 2010. The contribution of mobile genetic elements to the evolution and ecology of Vibrios. FEMS microbiology ecology 74:485-499.
77. Rivera IN, Souza KM, Souza CP, Lopes RM. 2012. Free-living and plankton-associated vibrios: assessment in ballast water, harbor areas, and coastal ecosystems in Brazil. Frontiers in microbiology 3:443.
78. Hunt DE, David LA, Gevers D, Preheim SP, Alm EJ, Polz MF. 2008. Resource partitioning and sympatric differentiation among closely related bacterioplankton. Science 320:1081-1085.
79. Naka H, Dias GM, Thompson CC, Dubay C, Thompson FL, Crosa JH. 2011. Complete genome sequence of the marine fish pathogen Vibrio anguillarum harboring the pJM1 virulence plasmid and genomic comparison with other virulent strains of V. anguillarum and V. ordalii. Infection and immunity 79:2889-2900.
80. Walling E, Vourey E, Ansquer D, Beliaeff B, Goarant C. 2010. Vibrio nigripulchritudo monitoring and strain dynamics in shrimp pond sediments. Journal of applied microbiology 108:2003-2011.
81. Reynaud Y, Saulnier D, Mazel D, Goarant C, Le Roux F. 2008. Correlation between detection of a plasmid and high-level virulence of Vibrio nigripulchritudo, a pathogen of the shrimp Litopenaeus stylirostris. Appl Environ Microbiol 74:3038-3047.
82. Hazen TH, Wu D, Eisen JA, Sobecky PA. 2007. Sequence characterization and comparative analysis of three plasmids isolated from environmental Vibrio spp. Appl Environ Microbiol 73:7703-7710.
CHAPTER TWO
High Frequency of a Novel Filamentous Phage, VCYΦ, within an Environmental
Vibrio cholerae Population
Hong Xue, Yan Xu, Yan Boucher and Martin F. Polz
Reprinted by permission from Applied and Environmental Microbiology
Copyright 2012
ASM Press, Washington DC
Xue, H., Y. Xu, Y. Boucher & M.F. Polz, (2012) High frequency of a novel filamentous
phage, VCY phi, within an environmental Vibrio cholerae population. Appl Environ
Microbiol 78: 28-33
High Frequency of a Novel Filamentous Phage, VCYΦ, within an
Environmental Vibrio cholerae Population
Hong Xue^, Yan Xu^*, Yan Boucher&, Martin F. Polz$
Department of Civil and Environmental Engineering, Massachusetts Institute of
Technology, Cambridge, Massachusetts
^ These authors contributed equally to this work.
* Present address: Tong Ji University, Shanghai, P.R.China
& Department of Biological Sciences, University of Alberta, CW405, Edmonton, AB,
T6G 2E9, Canada
$ Corresponding author. Mailing address: Massachusetts Institute of Technology, 48-421,
77 Massachusetts Ave., Cambridge, MA 02139. Phone: (617) 253-7128. Fax: (617) 258-8850.
E-mail: mpolz@mit.edu
2. Chapter Two: High Frequency of a Novel Filamentous Phage, VCYΦ, within an
Environmental Vibrio cholerae Population
2.1. Abstract
Environmental Vibrio cholerae strains isolated from a coastal brackish pond (Oyster Pond,
Woods Hole, MA) carried a novel filamentous phage, VCYΦ, which can exist as a host
genome-integrated form (IF) and a plasmid-like replicative form (RF). Outside the cell, the
phage displays morphology typical of Inovirus, with filamentous particles ~1.8 µm in length
and 7 nm in width. Four independent RF isolates had identical genomes except for 8 single
nucleotide polymorphisms (SNPs) clustered in two regions. The overall genome size is
7,103 bp with 11 putative ORFs, organized into three functional modules (replication,
structure and assembly, and regulation). VCYΦ shares sequence similarity with other
filamentous phages (including the cholera disease-associated CTX) in a highly mosaic
manner, indicating evolution by horizontal gene transfer and recombination. VCYΦ integrates
in the vicinity of the putative translation initiation factor Sui1 in chromosome II of V.
cholerae. A screen of 531 closely related host isolates showed that ~40% harbored phage,
with 27% and 13% carrying the IF and RF, respectively. The relative frequency of RF and IF
differed among strains isolated from the pond or lagoon of Oyster Pond, suggesting that
host habitat influences intracellular phage biology. The overall high prevalence within
the host population shows that filamentous phages can be an important component of the
environmental biology of V. cholerae.
2.2. Introduction
Filamentous phages of the genus Inovirus are unusual among bacterial viruses in that they
do not lyse host cells when new phage particles are produced. Instead, new virions are
packaged on the cell surface and extruded (24). These virions contain ssDNA that typically
enters new hosts via a variety of pili positioned on the cell surface (26). Inside the host,
inoviruses can persist as a circular, double-stranded replicative form (RF); alternatively,
they can integrate into the host chromosome by a variety of mechanisms, including
phage-encoded transposases (19) and host-encoded XerC/D (11, 13), which normally resolve
chromosome dimers. Production of new, single-stranded phage DNA can proceed via
rolling circle replication from the RF. The genomes of inoviruses are composed of modules
that encode genome replication, virion structure and assembly, and regulation (3);
additionally, like many other phages, inoviruses can undergo extensive recombination,
often picking up new genes in the process so that they may act as important gene transfer
mechanisms among hosts (7, 9).
Vibrio cholerae, an environmental bacterium that includes strains capable of causing the
diarrheal disease cholera, has become somewhat of a model for studying Inovirus biology and
diversity. This is because an important pathogenicity factor, the cholera toxin (CT), is
encoded and transferred by the filamentous phage CTXΦ (21). Infection is mediated by
recognition of a type IV pilus (toxin coregulated pilus), and the phage genome can
irreversibly integrate into the host chromosome at one of two dif sites (dif1 and dif2),
which are the target of XerC/D-mediated recombination with phage att sites (attP) and are
present on V. cholerae chromosomes 1 and 2, respectively (22). Different variants of CTXΦ
are specific for either dif1 or dif2, where they can integrate as single or tandem copies
(6). A number of additional filamentous phages have been described for V. cholerae,
including VEJΦ (3), VGJΦ (4), KSF-1Φ (9), VSKΦ (17), VSKKΦ, fs1Φ (23), fs2Φ (8), Vf33Φ (27)
and 493Φ (16). Importantly, it has recently been shown that several filamentous phages
display cooperative interactions, and that a process of sequential infection, involving two
satellite and three helper phages, may have been important in the evolution of V. cholerae
strains associated with the seventh pandemic (11).
Here we characterize a novel filamentous phage, designated VCYΦ, from an environmental
V. cholerae population. We also show that VCYΦ had a remarkably widespread
distribution in the host population it originated from and that the prevalence of RF vs. IF in
host cells appears to be influenced by host habitat and lifestyle.
2.3. Materials and Methods
2.3.1. V. cholerae isolation and propagation
Vibrio cholerae strains were isolated from surface water of Oyster Pond, Woods Hole, MA,
and its lagoon connecting the pond to the coastal ocean on September 8, 2008. The water
temperature and salinity were 24.5 and 26 °C, and 4 and 5 ppt for the pond and lagoon,
respectively. Particle-associated and free-living bacterial populations were collected by
sequential filtration of water samples onto filters with different size cutoffs following the
protocol in (14). For the largest fraction, which is enriched in zooplankton, three replicate
water samples of ~100 L each were filtered through a 63 µm plankton net (Wildlife Supply
Company) and the filtrate collected for strain isolation in the lab. For the remaining 3 size
fractions, 3 replicate 1 L samples, which had been prefiltered to remove the 63 µm fraction,
were collected and transported to the lab for further processing.
In the laboratory, all materials retained on 63 µm filters were homogenized using a tissue
grinder (VWR Scientific) and vortexed for 20 minutes at low speed. The replicate 1-L water
samples from which the >63 µm fraction had been removed were sequentially filtered
through 5, 1 and 0.2 µm pore size filters where the 63-5 and 5-1 µm size fractions were
collected using gravity filtration to avoid breakdown of fragile particles. For these, filtration
was repeated with sterile seawater to further remove cells unattached to particles.
Subsequently, all filters were placed into 50 ml conical tubes containing 45 ml sterile
seawater and vortexed for 20 minutes at low speed to break up particles and resuspend
bacterial cells. Supernatants were used for isolation of V. cholerae by concentrating serial
dilutions onto 0.2 µm Supor-200 filters (Pall) using gentle vacuum pressure. These filters
were then placed onto agar plates containing Vibrio selective Thiosulfate Citrate Bile Salts
Sucrose media (BD Difco) with 2% NaCl (marine TCBS). Single colonies were picked and
re-streaked three times by alternating Tryptic Soy Broth (TSB) (BD Bacto) with 2% NaCl
and marine TCBS media to obtain pure strains. For all subsequent analyses, the stock
cultures were used to avoid unequal treatment of strains. Identification of V. cholerae was
done by partial sequencing of the mdh gene as described in (1). For routine propagation,
strains were grown overnight in Luria-Bertani (LB) broth (Difco) at 25 °C in a shaking bath
(180 rpm). Phage was originally detected as a plasmid-like band in genomic
DNA preparations analyzed on agarose gels.
2.3.2. DNA isolation and sequencing.
DNA was extracted from V. cholerae for sequencing of the replicative, plasmid-like form
(RF) of VCYΦ and to determine the insertion site of the integrative form (IF) in the host
chromosome. To obtain RF DNA, plasmid-like genomes were isolated from 2 ml of
overnight culture of V. cholerae strains 10E09PW02, 10F04PW02, 5G03LW63 and
11H04LW5 using the Qiaprep Spin Miniprep kit (Qiagen Inc.). Subsequently, DNA was
electrophoretically separated on 0.8% agarose gels, and the bands corresponding to the RF
were cut out and purified using gel extraction kits (Qiagen Inc.). RF DNA from strain
10E09PW02 was tagged with barcode A6-B15 (Table 1) while DNA from the remaining three
RFs was combined and tagged with barcode A4-B14 for Illumina sequencing.
An Illumina sequencing protocol (25) was modified to allow for small plasmid library
preparation as follows. DNA libraries were prepared by shearing about 1 µg RF DNA in a
volume of 50 µl into fragments with an average length of ~400 bp. This was done using 14
cycles of alternating 30-second ultrasonic bursts and 30-second pauses in a 4 °C water bath
in a Bioruptor UCD-200 (Diagenode). The fragments were then end-repaired and
phosphorylated using the End-Repair kit (New England Biolabs). The products were subjected
to a ligation reaction with a 10-fold molar excess of Illumina adapters (Table 1) using
the Quick Ligation kit (New England Biolabs). The ligation product was separated on 1.5%
agarose gels and fragments of 300-500 bp size were purified with 10 µl EB buffer using the
Qiagen MinElute Reaction Cleanup kit (Qiagen Inc.). The fragments were nick translated
with Bst Polymerase (New England Biolabs) in 30 µl final volume. Eight replicate 2 µl
aliquots of the reaction product were used without further purification in PCR amplifications
using Phusion Hot Start High-Fidelity DNA polymerase (New England Biolabs), and reaction
progress was monitored on a Bio-Rad Opticon real-time PCR instrument. The reactions
were stopped in the late logarithmic amplification phase and the DNA from the replicate
reactions pooled. To generate the ready-to-sequence DNA, libraries were subjected to an
additional gel purification step to remove adapter dimers and residual primers. The quality
and size distribution of the DNA libraries were checked by Agilent Bioanalyzer DNA-1000
assays (Agilent Technologies, Inc.). The two libraries were pooled with 34 other libraries
that had different barcodes for deconvolution after sequencing. The samples were loaded
onto an Illumina GAIIx sequencer and the resultant data were analyzed using the
Illumina pipeline 1.4.0 to generate fastq files. Sequences were reconstructed and annotated
using NextGen 1.9 (Softgenetics Inc.) and DNAmaster software
(http://cobamide2.bio.pitt.edu), respectively.
To determine the host chromosomal region of phage insertion in strains 4A03LW1 and
4B03LW1, a walking PCR protocol (18, 20) was used, taking advantage of the fact that the
attP site is split during insertion of phage into the host chromosome. Biotinylated primers,
PCRwalking-biotin and PCRwalking-biotin-asp (Table 1), facing outwards from the
predicted attP site were designed and used to obtain single-stranded PCR products. In a
typical reaction, 20 ng host DNA containing integrated VCYΦ was mixed with 0.5 pmoles
biotinylated primer and 0.5 U Platinum Taq Hi-Fidelity (Invitrogen). Amplification used a
three-step cycling program (94 °C for 30 s; 45 °C for 30 s; 68 °C for 5 min) for 35 cycles.
The extension products were captured on Streptavidin beads (Promega), purified and stored
in 1x terminal deoxynucleotidyl transferase buffer. A polyG tail was added to the purified
extended products by incubation with 4 mM dGTP and 4 U TdT enzyme (Promega) at 37 °C
in a shaking bath (200 rpm) for 2 hours. The polyG-tailed products were made double
stranded by using the PCRwalking-anchor-C12 and PCRwalking-nest-asp primers (Table
1). The PCR products were separated on a 1% agarose gel and fragments 2-4 kb in size were
purified using the Qiaquick gel extraction kit (Qiagen). The purified DNA fragments were
re-amplified with primers PCRwalking-nest and PCRwalking-anchor (as
PCRwalking-anchor-C12 but lacking the run of 12 C) (Table 1).
Table 1. List of primers used in this study

Primer                       Sequence                                                                Reference
Illumina adapter A4 up       5'/5AmMC6/ACACTCTTTCCCTACACGACGCTCTTCCGATCTGCAGG-3'                     This study
Illumina adapter A4 down     5'CCTGCAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAC/3AmM/-3'                     This study
Illumina adapter A6 up       5'/5AmMC6/ACACTCTTTCCCTACACGACGCTCTTCCGATCTAATTC-3'                     This study
Illumina adapter A6 down     5'GAATTAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAC/3AmM/-3'                     This study
Illumina adapter B14 up      5'TACTGAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGC/3AmM/-3'                      This study
Illumina adapter B14 down    5'/5AmMC6/CTCGGCATTCCTGCTGAACCGCTCTTCCGATCTCAGTA-3'                     This study
Illumina adapter B15 up      5'AGCAGAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGC/3AmM/-3'                      This study
Illumina adapter B15 down    5'/5AmMC6/CTCGGCATTCCTGCTGAACCGCTCTTCCGATCTCTGCT-3'                     This study
Illumina_amp_1               5'AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-3'         (25)
Illumina_amp_2               5'AAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT-3'       (25)
VCYint_F                     5'-TTAACATTGTCAAATGATAAATATG-3'                                         This study
VCYint_R                     5'-ATAATCAACTGATAATGTTGCAAAC-3'                                         This study
PCRwalking-biotin            5'-biotin-CAACACAGCCCATTATTTTAGCCCC-3'                                  This study
PCRwalking-biotin-asp        5'-biotin-CATTTCACCATTTTATATTGCGCGT-3'                                  This study
PCRwalking-biotin-nest       5'-CATTTCACCATTATATTGCGCGT-3'                                           This study
PCRwalking-biotin-nestasp    5'-TCTGAACTGTTAGACGCCTACAAAA-3'                                         This study
PCRwalking-anchor-C12        5'CCACGCGTCGACTAGTAATTCCCCCCCCCCCCDN-3'                                 This study
PCRwalking-anchor            5'-CCACGCGTCGACTAGTAATT-3'                                              This study
VCYΦ_Seq_2                   5'-ATATCAATGCTTTGCGGTGGTCTAG-3'                                         This study
VCYΦ_Seq_1                   5'-TCGATTCATTGTTAAAACTCCCAAAATCG-3'                                     This study
The DNA was purified again by agarose gel electrophoresis and the Qiaquick gel extraction
kit, and was sequenced using the Sanger method with either primer VCYΦ_Seq_1 or
VCYΦ_Seq_2 (Table 1).
To test whether the phage DNA was in single-stranded form, DNA from phage particles
was isolated as described by Faruque et al. (2005) (9) and digested with DNase I.
2.3.3. PCR-based phage identification and screen for RF or IF in host cells.
Because different strains were used for sequencing and electron microscopy of phage, and
to ensure that the RF and IF represent similar phage, we devised specific PCR primers
targeting a gene (ORF9) currently unique to VCYΦ (Table 1).
To identify host isolates containing the RF and/or IF, a PCR protocol was devised that can
differentiate the two forms. This was achieved by designing one set of primers flanking the
attB site in the bacterial chromosome (primers VCYint_F and VCYint_R; Table 1),
which is split during integration, so that these primers only yield a product for strains not
carrying the IF of VCYΦ. Similarly, a second set of primers flanking the attP site of the
phage (primers VCYΦ_Seq_1 and VCYΦ_Seq_2; Table 1) was used to confirm the
presence of the RF of VCYΦ. For identification of the IF, we used primer
VCYint_F paired with VCYΦ_Seq_1 or VCYΦ_Seq_2, which produces a 190 bp or 150 bp
PCR product, respectively, if VCYΦ is integrated into the attB site.
In a typical reaction, 20 ng genomic DNA or 2 µl of a 1:10 diluted culture was used as
template and mixed with 0.4 µM RF-specific or IF-specific primers and the polymerase mixture
provided in the Qiagen HotStarTaq Master Mix Kit. The cycling program consisted of
initial denaturation at 95 °C for 15 min; 30 cycles of denaturation at 94 °C for 30 s,
annealing at 52 °C for 30 s and extension at 72 °C for 30 s; and final
extension at 72 °C for 5 min. The PCR products were separated on a 1.8% agarose gel
prepared with 0.5x TBE buffer.
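The scoring logic of this screen can be summarized in a short sketch. The Python below is
illustrative only: the field names, the category labels and the handling of ambiguous
combinations are assumptions made for this example and are not part of the protocol
described above. It encodes the rule that a product from the IF-specific primer pair indicates
an integrated phage, whereas RF carriage is called from the attP-flanking PCR together with
a 7 kbp band on the gel.

```python
from dataclasses import dataclass


@dataclass
class PcrScreen:
    """Assay outcomes for one isolate (all field names are hypothetical)."""
    attb_intact: bool    # product from primers flanking attB (site not split by phage)
    attp_intact: bool    # product from primers flanking attP (circular RF present)
    junction: bool       # product from the IF-specific primer pair
    gel_band_7kb: bool   # plasmid-like 7 kbp band visible on the agarose gel


def classify(screen: PcrScreen) -> str:
    """Return 'IF', 'RF', 'none' or 'unresolved' for one isolate."""
    if screen.junction:
        # Integrated phage detected. An attP signal without a gel band is
        # still scored as IF; a gel band on top is left unresolved here.
        return "unresolved" if screen.gel_band_7kb else "IF"
    if screen.attp_intact and screen.gel_band_7kb:
        return "RF"
    if screen.attb_intact and not screen.attp_intact and not screen.gel_band_7kb:
        return "none"
    return "unresolved"


if __name__ == "__main__":
    example = PcrScreen(attb_intact=False, attp_intact=True,
                        junction=True, gel_band_7kb=False)
    print(classify(example))
```

Combinations that the text does not address, such as an attP product without a gel band
and without an IF signal, are deliberately returned as "unresolved" rather than guessed.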
2.3.4. Electron microscopy
To prepare phage for electron microscopy (EM), V. cholerae strain 7D07PW5, which
carries the RF of VCYΦ, was grown overnight in 100 ml LB medium at room temperature
in a shaking water bath (180 rpm). The supernatant containing phage was collected by
centrifuging the culture at 8,000 x g for 15 min and subsequent filtering through a 0.22 µm
pore size filter. A 100-µl aliquot of the filtered supernatant was spread on an LB agarose
plate for sterility assurance. To precipitate the phage particles, NaCl and polyethylene
glycol 6,000 were added to the filtrate to final concentrations of 2.5 and 5%, respectively.
The mixture was incubated on ice for 30 min, followed by centrifugation at 13,000 x g for
30 min. The phage-containing pellet was collected and resuspended in 500 µl phosphate
buffered saline.
For EM, purified phage particles were negatively stained with 4% (w/v) uranyl acetate and
mounted on freshly prepared Formvar grids. Phage samples were photographed under a FEI
Tecnai Spirit Transmission Electron Microscope. The average length and width of the
phage were determined from six individual particles.
2.3.5. Nucleotide sequence accession numbers.
The genome sequence of VCYΦ from strain 10E09PW02 has been deposited in GenBank
with accession number JN848801.
2.4. Results and Discussion
2.4.1. Characterization of the replicative form (RF) of filamentous phage VCYΦ.
Among a collection of 531 environmental V. cholerae isolates from Oyster Pond, 77
contained putative episomal elements when screened by gel electrophoresis. Seven strains
contained episomal elements of variable sizes; however, seventy appeared similar in size.
Restriction endonuclease analysis for a subset of 10 of these elements using BamHI, EcoRI
and PstI revealed identical patterns, suggesting a closely related, double-stranded
plasmid-like element of approximately 7 kbp in size (data not shown). Subsequent genome
sequencing of 4 of these 7 kbp plasmids suggested them to be the replicative form (RF) of a
new filamentous phage, which we call VCYΦ (Figure 1).
The whole genome of VCYΦ phage consists of 7,103 nucleotides (Figure 1) with a G+C
content of 41.8 mol %. Among the 4 sequenced genomes, only 8 SNPs, clustered in two
regions, were evident: 3 SNPs in a 7 bp stretch (SNP-A in Figure 1) and 5 SNPs in a 14 bp
stretch (SNP-B in Figure 1). Overall, the phage contains 11 open reading frames (ORF1 to
ORF11), predicted by Blast search. Of these ORFs, 9 are homologous to protein coding
genes previously reported from other filamentous phages, including KSF-1Φ (9) and VGJΦ
(4). Based on similarity in sequence and organization to these other phages, the ORFs of
VCYΦ can be classified into functional modules for replication, structure-assembly, and
regulation (Figure 1).
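As a brief illustration of how a G+C content such as the value quoted above is obtained
from a genome sequence, the following minimal Python sketch counts G and C among the
unambiguous bases. The function name and the toy input are assumptions made for this
example; the sequence shown is not the VCYΦ genome, and this is not necessarily how the
reported value was computed.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) enter the denominator, so
    ambiguity codes such as N are ignored. Case is not significant.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    toy = "ATGCATGCAT"  # hypothetical 10-base example, not phage sequence
    print(f"G+C content: {gc_content(toy):.1f} mol %")
```

Applied to the deposited genome (GenBank JN848801), a function of this kind should return
a value close to the reported 41.8 mol %, provided the same strand and sequence version
are used.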
The putative replication module, composed of ORF1 to ORF3 (Figure 1), maps to the same
position as rstA and rstB in CTXΦ (8) and as gII and gV in M13 phage (2). ORF2 and ORF3
share amino acid sequence similarity with the proteins of potential phage replication genes
and genes of potential ssDNA-binding proteins (26). We therefore suggest that ORF2 and
ORF3 play similar roles in VCYΦ. A hypothetical gene, ORF1, is associated with the
replication module based on its map position and the overlap of its stop codon with ORF2;
however, its function remains unknown.
[Figure 1: linear ORF maps of VCYΦ (7,103 bp), KSF-1Φ (7,107 bp) and VGJΦ (7,542 bp),
with attP and the SNP-A and SNP-B regions marked; graphic not reproduced.]
Figure 1. Genome organization of VCYΦ phage. Linear ORF maps of VGJΦ,
KSF-1Φ and VCYΦ phage were aligned based on their modular structure. ORFs
or genes are represented by arrows oriented in the direction of transcription. Black,
white and light grey arrows represent replication, structural and assembly, and
regulation modules, respectively. Dark grey arrows represent unknown ORFs. The
attP sequences of VGJΦ and two SNP regions (SNP-A and SNP-B) are also
indicated.
The putative structural and assembly module consists of ORF4 to ORF8 (Figure 1), each
similar in size, genome position and sequence to the corresponding
capsid proteins of other filamentous phage (3, 4, 8, 9). For instance, ORF6 exhibits similar
size and genome position to gIII of CTXΦ, which encodes a minor capsid protein pIII that
recognizes and interacts with receptors and coreceptors. The protein encoded by ORF8 is
similar to the pI protein of Lf phage of Xanthomonas campestris (5) and the Zot protein of
CTXΦ, which are both required for viral particle packaging and secretion.
ORF10 and ORF11 likely encode regulatory proteins constituting the third module. Both
ORFs are oriented in the opposite direction to the rest of the ORFs and exhibit homology to
ORF136 and ORF154 of VGJΦ (4), which encode a potential regulatory and repressor protein,
respectively. Finally, ORF9 is a conserved hypothetical protein whose function has not been
established. Its location between the structure/assembly and regulation modules is the same
as that of ctxA and ctxB in CTXΦ. However, ORF9 does not share homology with these genes,
which code for the cholera toxin (CT), an important pathogenicity determinant in the
diarrheal disease cholera. Based on these comparisons, it seems likely that ORF9 is not
associated with any of the three modules and may provide an additional but currently unknown
function to the phage.
As in other sequenced filamentous phages from Vibrio strains (4, 9, 10), the three modules
of VCYΦ phage appear to be evolutionary mosaics assembled by horizontal gene transfer.
At the nucleic acid level, the structure and assembly module of VCYΦ phage and
filamentous phage KSF-1Φ from V. cholerae share 76% identity (9) while the regulatory
module of VCYΦ phage displays only low similarity to that of KSF-1Φ; instead it is 80%
similar to the corresponding module of phage VGJΦ (4). The hybrid genome of
VCYΦ phage thus supports the view that horizontal gene transfer, possibly by co-infection,
is a significant driving force in the evolution of filamentous vibriophages.
2.4.2. Characterization of the viral particle
A filamentous phage structure was detected by electron microscopy in precipitates obtained
from filtered supernatant of strain 7D07PW5, which had been shown to contain the 7 kbp
plasmid-like structure in the gel assay. These phage-like particles were 1.762 ± 0.016 µm in
length and 7 nm in width (n=6) (Figure 2). The size of VCYΦ is similar to those typically
found in the genus Inovirus (28) (0.8-2 µm in length and 6-7 nm in width) including fs2Φ
(15) and KSF-1Φ (9).
[Figure 2: electron micrographs, panels A and B; graphic not reproduced.]
Figure 2. Electron micrograph of VCYΦ phage particles. Phage particles were
isolated from the culture supernatant of strain 7D07PW5. Scale bars in both
panels represent 100 nm.
2.4.3. Comparison of RF and IF by PCR screen.
To gain further confidence that the RF and IF of the phage detected in host cells represent
the same phage, we used a specific PCR assay targeting ORF9 (Table 1), which is currently
unique to VCYΦ. This gave positive results for all 10 strains assayed, including those used
for sequencing and electron microscopy. Together with the PCR assays differentiating IF
and RF, this suggests that the phage present in the Oyster Pond isolates were of a highly
similar nature.
2.4.4. Characterization of the integration site of VCYΦ.
Because several filamentous phages have been shown to integrate into the V. cholerae
chromosome, we investigated this ability in VCYΦ. We first identified a putative 28
nucleotide-long attP site by comparison with VGJΦ (4), since the regulation module of this
phage differs from that of VCYΦ at only 20% of nucleotide positions (Figure 3). This
comparison also revealed potential binding sites for XerC and XerD, which mediate
phage-host recombination (22). The XerC site of VCYΦ phage differs from that of VGJΦ phage
in four nucleotides whereas the XerD sites are identical. As in VGJΦ phage, the XerC
and XerD sites of VCYΦ phage are separated by 7 nucleotides. The attP sequences
of some filamentous phage from Vibrio are homologous, suggesting the attP structure is
important for recognition and recombination by XerC and XerD.
To characterize the phage integration site of the host chromosome (attB), we first developed
a PCR screen to distinguish V. cholerae strains containing either the IF or RF of the VCYΦ
phage. Strain 4A01LW1, for which this analysis detected an integrated phage, was chosen
for further characterization of the chromosomal location of the attB site using walking PCR.
Sequence analysis of PCR products and comparison with the genome sequence of V.
cholerae O1 N16961 (12) identified a putative attB site, 28 bp long and identical to the dif1
site of V. cholerae O1 N16961 except for a single nucleotide position (Figure 3A).
However, unlike the dif1 site of V. cholerae O1 N16961, which is located on chromosome
I, the 1,054 bp region flanking the attB site in strain 4A01LW1 shared sequence similarity
(84%) with chromosome II. In this strain, the attB site is located between a putative
transposase and sui1, which encodes a translation initiation factor (Figure 3B). This
suggests VCYΦ phage, like other filamentous phage in Vibrio strains, uses the XerC and
XerD recombination system to integrate into chromosomal dif-like sequences.
[Figure 3: panel A, alignment of attP-VCYΦ, attP-VGJΦ, dif1 (V. cholerae) and attB-VCYΦ
with the XerC and XerD binding sites marked; panel B, map of the attB region between a
transposase and sui1; graphic not reproduced.]
Figure 3. attP site of VCYΦ and attB site of integration of VCYΦ into chromosome
II of strain 4A01LW1. (A) Sequence alignment of the attP regions of VCYΦ and
VGJΦ (AY242528), and of the attB region of strain 4A01LW1 and dif1 of V. cholerae
N16961. (B) Schematic representation of the integration region of chromosome II of
strain 4A01LW1. The attB sequence region is also indicated.
2.4.5. Distribution of VCYΦ across the environmental V. cholerae population.
Using data from gel analysis and the PCR-based screens for either RF or IF, we further
investigated the frequency of the two forms of the VCYΦ phage across a large collection of
V. cholerae isolates from Oyster Pond. This showed that 220 (41.4%) of a total of 531
isolates contained either the RF or IF of VCYΦ phage (Table 2), and seventy (13.2%) of
531 strains carried only the 7 kbp RF of VCYΦ phage as suggested by gel electrophoresis
(Table 2). Because the IF was not detectable by the IF-specific PCR assay in any of these
strains, the RF of VCYΦ phage appears to be able to replicate without integration into the
chromosome. Overall, this suggests remarkably high prevalence of this phage within this
environmental V. cholerae population.
Table 2. Frequency of the IF and RF of VCYΦ phage in a collection of 531 V. cholerae
isolates from coastal Oyster Pond (MA) and its lagoon.

          No. of      No. of isolates   No. of strains   Total no. of strains
          isolates    with RF (1)       with IF (2)      containing VCYΦ phage
Pond      360         59 (16.4%)        79 (21.9%)       138 (38.3%)
Lagoon    171         11 (6.4%)         71 (41.5%)       82 (48.0%)
Total     531         70 (13.2%)        150 (28.2%)      220 (41.4%)

(1) The RF was detected as a 7 kbp, plasmid-like band by agarose gel electrophoresis and
by an additional RF-specific PCR assay that followed the IF-specific PCR assays.
(2) The IF was identified by IF-specific PCR assays.
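The percentages in Table 2 follow directly from the isolate counts, and the short Python
sketch below recomputes them as a consistency check. The counts are taken from Table 2;
the variable and function names are assumptions introduced for this illustration, and
rounding to one decimal place is assumed to match the table.

```python
# Isolate counts per habitat, copied from Table 2.
COUNTS = {
    "Pond": {"isolates": 360, "RF": 59, "IF": 79},
    "Lagoon": {"isolates": 171, "RF": 11, "IF": 71},
}


def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def summarize(counts: dict) -> dict:
    """Add a 'Total' row and compute RF, IF and overall carriage percentages."""
    rows = dict(counts)
    rows["Total"] = {
        key: sum(site[key] for site in counts.values())
        for key in ("isolates", "RF", "IF")
    }
    summary = {}
    for name, row in rows.items():
        carriers = row["RF"] + row["IF"]  # strains are scored as RF or IF, not both
        summary[name] = {
            "RF_pct": percent(row["RF"], row["isolates"]),
            "IF_pct": percent(row["IF"], row["isolates"]),
            "carriers": carriers,
            "carrier_pct": percent(carriers, row["isolates"]),
        }
    return summary


if __name__ == "__main__":
    for name, row in summarize(COUNTS).items():
        print(name, row)
```

Because each phage-positive strain is scored as carrying either the RF or the IF, the total
carriage in each row is simply the sum of the two columns.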
Although it is impossible to know exactly how initial isolation has affected transition
between RF and IF or loss of the phage, we suggest that the numbers provided represent a
lower bound of the frequency of the phage in this environmental V. cholerae population.
We found that after regrowth of strains from liquid stock cultures, 14% of strains had lost
the RF but in only one strain had the RF transitioned to IF. Moreover, only a single loss of
the IF was observed when 2 strains (4A01 and 4B03) were streaked from liquid media and
a total of 39 single colonies were assayed. This suggests that the RF and IF are moderately
stable when strains are propagated but also indicates that an even higher proportion of strains
in the environmental population may have been harboring phage.
Another 150 (28.2%) of the 531 host isolates carried the IF of VCYΦ phage detectable by
the IF-specific PCR screen, suggesting high prevalence of the integrated phage in the V.
cholerae population. Although none of these strains showed a visible 7-kbp band by agarose
gel electrophoresis, 113 also gave positive results with the RF-specific PCR screen (Table
2). Because lysogen repression is never absolute, it is possible that a small subpopulation in
each culture tube had transitioned from the IF to the RF. These host strains were therefore
scored as containing IF only, but the presence of small amounts of RF underscores that the exact
proportions of strains containing RF and IF may have shifted during culturing. We therefore
only stress trends in populations from different habitats based on the assumption that these
should be unaffected by transitions between the two forms during regrowth of strains, which
was highly standardized (three streaks post isolation with subsequent analyses carried out
from freezer stock cultures).
The overall frequency of the phage (RF and IF) was similar among host isolates from lagoon
and pond (Table 2); however, they displayed different trends in the presence of IF and RF.
While in the pond the frequencies of IF (22%) and RF (16%) were roughly equal, in the
lagoon the vast majority was in the IF (42%) rather than the RF (6%) (Table 2). As detailed
above, we deem it unlikely that such a difference arose post isolation of host strains, so
that environmental factors may play a role in transition between RF and IF. The population
in the lagoon may thus have been producing fewer phage particles than its equivalent in the
pond; however, we emphasize that this observation will have to be verified by
culture-independent methods in the future.
In summary, we have described a novel filamentous phage infecting V. cholerae, adding to
the already considerable number of such phages in this species. We show that this phage
had a surprisingly high prevalence in environmental host populations when sampled in late
summer and that the transition between IF and RF may be influenced by environmental
factors. Overall, filamentous phages appear to be an important factor in the environmental
biology of V. cholerae and can affect a large fraction of the cells within a population.
2.5. Acknowledgements
This work was supported by grants from the National Science Foundation Evolutionary
Ecology program, and the National Science Foundation and National Institutes of Health
co-sponsored Woods Hole Center for Oceans and Human Health, the Moore Foundation
and the Department of Energy to MFP, as well as postdoctoral fellowships from the
MIT-Merck alliance to YB. YX would like to acknowledge support from the Chinese Scholarship
Council during her stay at MIT.
References:
1. Boucher, Y., O. X. Cordero, A. Takemura, D. E. Hunt, K. Schliep, E. Bapteste, P.
Lopez, C. L. Tarr, and M. F. Polz. 2011. Local mobile gene pools rapidly cross
species boundaries to create endemicity within global Vibrio cholerae populations.
mBio 2:e00335-00310.
2. Calendar, R. 1988. The Bacteriophages. Plenum Press, New York.
3. Campos, J., E. Martinez, Y. Izquierdo, and R. Fando. 2010. VEJΦ, a novel
filamentous phage of Vibrio cholerae able to transduce the cholera toxin genes.
Microbiology 156:108-115.
4. Campos, J., E. Martinez, E. Suzarte, B. L. Rodriguez, K. Marrero, Y. Silva, T. Ledon,
R. del Sol, and R. Fando. 2003. VGJ phi, a novel filamentous phage of Vibrio
cholerae, integrates into the same chromosomal site as CTX phi. J Bacteriol
185:5685-5696.
5. Chang, K. H., F. S. Wen, T. T. Tseng, N. T. Lin, M. T. Yang, and Y. H. Tseng. 1998.
Sequence analysis and expression of the filamentous phage phi Lf gene I encoding
a 48-kDa protein associated with host cell membrane. Biochem. Biophys. Res.
Commun. 245:313-318.
6. Das, B., J. Bischerour, and F. X. Barre. 2011. Molecular mechanism of acquisition
of the cholera toxin genes. Indian J Med Res 133:195-200.
7. Davis, B. M., and M. K. Waldor. 2003. Filamentous phages linked to virulence of
Vibrio cholerae. Curr Opin Microbiol 6:35-42.
8. Ehara, M., S. Shimodori, F. Kojima, Y. Ichinose, T. Hirayama, M. J. Albert, K.
Supawat, Y. Honma, M. Iwanaga, and K. Amako. 1997. Characterization of
filamentous phages of Vibrio cholerae O139 and O1. FEMS Microbiol Lett
154:293-301.
9. Faruque, S. M., I. Bin Naser, K. Fujihara, P. Diraphat, N. Chowdhury, M.
Kamruzzaman, F. Qadri, S. Yamasaki, A. N. Ghosh, and J. J. Mekalanos. 2005.
Genomic sequence and receptor for the Vibrio cholerae phage KSF-1phi:
evolutionary divergence among filamentous vibriophages mediating lateral gene
transfer. J Bacteriol 187:4095-4103.
10. Faruque, S. M., and J. J. Mekalanos. 2003. Pathogenicity islands and phages in
Vibrio cholerae evolution. Trends Microbiol 11:505-510.
11. Hassan, F., M. Kamruzzaman, J. J. Mekalanos, and S. M. Faruque. 2010. Satellite
phage TLCphi enables toxigenic conversion by CTX phage through dif site
alteration. Nature 467:982-985.
12. Heidelberg, J. F., J. A. Eisen, W. C. Nelson, R. A. Clayton, M. L. Gwinn, R. J.
Dodson, D. H. Haft, E. K. Hickey, J. D. Peterson, L. Umayam, S. R. Gill, K. E.
Nelson, T. D. Read, H. Tettelin, D. Richardson, M. D. Ermolaeva, J. Vamathevan,
S. Bass, H. Qin, I. Dragoi, P. Sellers, L. McDonald, T. Utterback, R. D. Fleishmann,
W. C. Nierman, O. White, S. L. Salzberg, H. O. Smith, R. R. Colwell, J. J.
Mekalanos, J. C. Venter, and C. M. Fraser. 2000. DNA sequence of both
chromosomes of the cholera pathogen Vibrio cholerae. Nature 406:477-483.
13. Huber, K. E., and M. K. Waldor. 2002. Filamentous phage integration requires the
host recombinases XerC and XerD. Nature 417:656-659.
14. Hunt, D. E., L. A. David, D. Gevers, S. P. Preheim, E. J. Alm, and M. F. Polz. 2008.
Resource partitioning and sympatric differentiation among closely related
bacterioplankton. Science 320:1081-1085.
15. Ikema, M., and Y. Honma. 1998. A novel filamentous phage, fs-2, of Vibrio
cholerae O139. Microbiology 144 (Pt 7):1901-1906.
16. Jouravleva, E. A., G. A. McDonald, C. F. Garon, M. Boesman-Finkelstein, and R.
A. Finkelstein. 1998. Characterization and possible functions of a new
filamentous bacteriophage from Vibrio cholerae O139. Microbiology 144 (Pt
2):315-324.
17. Kar, S., R. K. Ghosh, A. N. Ghosh, and A. Ghosh. 1996. Integration of the DNA of
a novel filamentous bacteriophage VSK from Vibrio cholerae O139 into the host
chromosomal DNA. FEMS Microbiol Lett 145:17-22.
18. Katz, L. A., E. A. Curtis, M. Pfunder, and L. F. Landweber. 2000. Characterization
of novel sequences from distantly related taxa by walking PCR. Mol Phylogenet
Evol 14:318-321.
19. Kawai, M., I. Uchiyama, and I. Kobayashi. 2005. Genome comparison in silico in
Neisseria suggests integration of filamentous bacteriophages by their own
transposase. DNA Res 12:389-401.
20. Luo, P., T. Su, C. Hu, and C. Ren. 2011. A novel and simple PCR walking method
for rapid acquisition of long DNA sequence flanking a known site in microbial
genome. Mol Biotechnol 47:220-228.
21. McLeod, S. M., H. H. Kimsey, B. M. Davis, and M. K. Waldor. 2005. CTXphi and
Vibrio cholerae: exploring a newly recognized type of phage-host cell relationship.
Mol Microbiol 57:347-356.
22. McLeod, S. M., and M. K. Waldor. 2004. Characterization of XerC- and
XerD-dependent CTX phage integration in Vibrio cholerae. Mol Microbiol 54:935-947.
23. Nakasone, N., Y. Honma, C. Toma, T. Yamashiro, and M. Iwanaga. 1998.
Filamentous phage fs1 of Vibrio cholerae O139. Microbiol Immunol 42:237-239.
24. Rakonjac, J., J. Feng, and P. Model. 1999. Filamentous phage are released from
the bacterial membrane by a two-step mechanism involving a short C-terminal
fragment of pIII. J Mol Biol 289:1253-1265.
25. Rodrigue, S., A. C. Materna, S. C. Timberlake, M. C. Blackburn, R. R. Malmstrom,
E. J. Alm, and S. W. Chisholm. 2010. Unlocking short read sequencing for
metagenomics. PLoS One 5:e11840.
26. Stassen, A. P., R. H. Folmer, C. W. Hilbers, and R. N. Konings. 1994.
Single-stranded DNA binding protein encoded by the filamentous bacteriophage M13:
structural and functional characteristics. Mol Biol Rep 20:109-127.
27. Taniguchi, H., K. Sato, M. Ogawa, T. Udou, and Y. Mizuguchi. 1984. Isolation and
characterization of a filamentous phage, Vf33, specific for Vibrio
parahaemolyticus. Microbiol Immunol 28:327-337.
28. Welsh, L. C., D. A. Marvin, and R. N. Perham. 1998. Analysis of X-ray diffraction
from fibres of Pf1 Inovirus (filamentous bacteriophage) shows that the DNA in the
virion is not highly ordered. J Mol Biol 284:1265-1271.
CHAPTER THREE
Diversity and Dynamics of Extrachromosomal Elements among Ecologically Defined Host Populations
3. Chapter Three: The Eco-evolutionary Dynamics of Extrachromosomal Elements
among Ecological Populations of Vibrionaceae
3.1. Abstract
Although plasmids and other extrachromosomal elements (ECEs) are recognized as key
players in horizontal gene transfer, their diversity and dynamics among ecologically
structured host populations in the wild remains poorly understood. Here we characterized
187 ECEs from 660 Vibrio isolates previously categorized into 25 ecologically and
genetically cohesive populations from a coastal environment (Plum Island Sound, Ipswich,
MA). ECEs are unevenly distributed among host populations and occur at higher frequency
in free-living cells, suggesting influence of host lifestyle on ECE carriage. We detected 22
temperate but non-integrated bacteriophages, and 24 conjugative, 38 mobilizable and 103
non-transmissible plasmids, which all differ in their putative mode and probability of
transmission. Based on sequence similarity, phages and plasmids were assigned to 2 and
29 different families, respectively. Many of the plasmid families contain low sequence
diversity but are widespread among host populations, indicating high eco-evolutionary
turnover. This includes non-transmissible ECEs, suggesting that these are misnamed and
are transferred by currently unknown mechanisms. Finally, analysis of recent gene transfer
among ECEs suggests that plasmids are highly recombinogenic and represent an extensive
network of gene transfer that has implications for horizontal gene transfer among distantly
related hosts.
3.2. Introduction
With few exceptions, plasmids and other extrachromosomal elements (ECEs) have been
studied with a purpose, i.e., as major conduits for the spread of resistance and virulence
genes. Only recently have whole genome sequencing of microbial hosts and direct
extraction of ECEs from environmental samples provided a more unbiased glimpse at their
large and functionally often uncharacterized diversity (1, 2). For plasmids, in particular,
this has demonstrated the existence of large numbers of types that have been broadly
categorized into conjugative, mobilizable and non-transmissible based on the presence (or
absence) of functional genes related to the ability to transfer between host cells (3, 4).
Much, however, remains to be learned about the ecological and evolutionary dynamics of
these different types of plasmids, such as their host range, nucleotide and gene content
diversity, and frequency and persistence within host populations in the wild (5). Even less
well studied are temperate phages that can manifest as ECEs, replicating as plasmid-like
structures during the lysogenic phase of their lifecycle. Examples of such phage ECEs are
some Tectiviridae (6), which have been found as linear plasmids in Bacillus species, and
phage N15, which is a relative of lambda phage (7). Because plasmid and phage ECEs
play roles as molecular symbionts or parasites, and can mediate horizontal gene exchange,
their biology must ultimately be studied in the context of host populations they invade;
however, this has remained difficult due to the dearth of suitable model systems of
ecologically and genotypically well-constrained bacterial populations.
Here we take a population-genomic approach to determine carriage of different types of
ECEs in a recently established model for ecologically and genetically cohesive bacterial
populations (8), asking whether different ECEs (i) are primarily associated with host
phylogeny or ecology, (ii) show evidence for distinct transfer (and loss) patterns, and (iii)
display different micro-evolutionary patterns. We use marine bacteria of the family
Vibrionaceae as our model for environmentally differentiated host populations. These have
previously been identified as genotypic clusters with characteristic distribution among
environmental samples suggesting that they partition resources in the coastal ocean by
differential occurrence among the free-living and associated (with suspended organic
particles and zooplankton) fractions of bacterioplankton (9-11). Many of these populations
do, however, co-occur on the surfaces and in the guts of larger marine animals providing
opportunity for transfer of ECEs via occasional contact. On the other hand, sampling during
different seasons has revealed strong temporal differentiation with the same 'habitat' type
often occupied by season-specific populations (9, 10).
Finally, recent analysis of recombination has indicated that these ecological populations
display cohesive behavior in terms of gene flow, making it possible for adaptive genes to
spread in a population-specific manner (12). Because of these properties, these clusters are
hypothesized to represent natural populations and provide a platform to investigate the
diversity and dynamics of ECEs.
To explore the diversity of ECEs within host populations, we screened a large collection
of isolates obtained from the coastal ocean in the spring and fall of 2006 (8). We aimed at
comprehensively sampling and sequencing all detectable ECEs of different sizes to obtain
as unbiased a picture of ECE diversity as possible. ECE diversity was analyzed in a
comparative genomic framework, integrating this analysis with both phylogenetic and
habitat information of the bacterial populations in order to identify differential associations
and dynamics.
Our data reveal surprising results about the diversity and distribution of ECEs among
Vibrio populations and reject several basic hypotheses about eco-evolutionary dynamics
based on prior literature. First, contrary to the expectation that due to the requirement of
cell-to-cell contact for transmission, plasmids should be predominantly associated with
isolates recovered from biofilms, we show that they are significantly enriched in free-living,
planktonic cells. Second, our data show that so-called non-transmissible plasmids, which
are the most common type of ECEs in Proteobacteria (13), are, in spite of their presumed
inability to self-mobilize, able to transfer rapidly and frequently. This suggests that
currently unrecognized transfer mechanisms are at work. Finally, we show that plasmid-like
temperate phages occur at considerable frequency within cells, suggesting that these
play a more important environmental role than previously anticipated.
3.3. Methods
3.3.1. Vibrio isolates and initial screening for plasmids
Isolates were selected from a previous collection of Vibrionaceae carried out to ascertain
co-existence of ecologically and genotypically structured populations in the same seawater
samples (8). These isolates had been frozen after minimal handling for purification to
reduce potential for plasmid and gene loss. A total of 660 isolates from spring (4/28/06)
and fall (9/6/06) samples were screened for ECE presence by using gel electrophoresis of
DNA that had been extracted using a modified QIAGEN plasmid purification kit. Briefly,
3 ml of buffers P1, P2 and P3 were sequentially added to a cell pellet, obtained after growth
in 30 ml of 1% Tryptic Soy Broth (TSB) media with 2% NaCl (BD Bacto) for 20 hours.
ECE DNA was further purified by phenol:chloroform extraction followed by precipitation
with isopropanol. DNA was re-dissolved in 500 µl TE buffer, of which 100 µl and 400 µl
were used for detection of small plasmids (<20 kbp) on 0.8% agarose gels and of large
plasmids (>20 kbp) on 1% agarose gels, respectively. The latter gels were run for 16-20
hours at 4 °C and at 5-10 V/cm. For each isolate, three replicate detection and isolation
procedures were performed.
3.3.2. ECE sequencing
To obtain large enough quantities of purified ECE DNA for sequencing, isolation was
performed from 200 ml cultures using the same protocol as above except that 4 ml of
buffers P1, P2 and P3 were added. Small bands (<20 kbp) were electrophoretically separated
and extracted from gels using the QIAEX II Gel Extraction Kit (Qiagen Inc.). Larger bands
(>20 kbp) were cut from the gel and 2 volumes of water and 3 volumes of Buffer QX1
(Qiagen Inc.) were added. The gel slices were incubated at 50 °C for 10 min or more until
the agarose was completely solubilized. Lastly, the ECE DNA was precipitated from the
supernatant by adding 1/10 volume of 3M sodium acetate (pH 5.2) and 0.7 volumes of
room-temperature isopropanol.
For sequencing of small ECEs, DNA libraries were prepared using the modified Illumina
sequencing protocol and multiplexed 36-fold using combinations of six barcodes as
described previously (14, 15). Barcoded DNA libraries were normalized and mixed for
sequencing on Illumina GAIIx sequencers and the data were analyzed using the Illumina
pipeline 1.4.0 to generate fastq files. Libraries of ECE DNA from larger bands were
prepared using Nextera DNA Sample Preparation Kits (Roche Titanium-compatible)
(Epicentre, Inc.) with 36 adapters (Table 1). Thirty or 36 DNA libraries, each tagged with
different adapters, containing the same amount of total DNA, were mixed and sequenced
using the Roche GS FLX system.
3.3.3. Assembly of plasmid contigs
Plasmids sequenced with 454 were assembled using Mira v.3.4.1 (16). Plasmids sequenced
with Illumina were assembled by trimming adapter sequences from the reads using
Cutadapt (17), followed by running the velvet assembler (18). The above assembly
procedure was partially automated with additional manual adaptation. To achieve the
highest accuracy, multiple assemblies were applied with varying parameters on subsets of
the data, both random and conditional on kmer abundance. Assemblies that most closely
matched the expectations of genome size and connectedness indicated by the kmer
spectrum were chosen for further analysis. Repeated trial and error were performed on the
assemblies that did not match the expectations well enough.
3.3.4. Annotation of proteins
We annotated the ORFs and the corresponding function of the encoded proteins (in terms
of FIGFAMS and Subsystems) using the RAST tools (19). Ten short bands for which no
ORFs could be annotated were removed from any further analysis. From the total set of
4,751 proteins and 5 RNA genes annotated in the remaining bands, we built families of
protein orthologs using orthoMCL (20).
3.3.5. ECE identification from band contigs
Because each gel band can, in principle, contain more than one independent element, we
developed a bioinformatics pipeline to identify independent ECEs. We constructed a
network of sequence similarity based on the gene content overlap across all contigs from
all bands. For any given pair of contigs a and b belonging to different bands we computed
two similarity metrics based on the number of protein families shared between the contigs.
Let n and m be the number of proteins encoded in each member of the pair (excluding
duplications) and s be the number of shared protein families between the contigs. Local
similarity (LS, 0 ≤ LS ≤ 1) between the contigs was defined as LS(a,b) = s/min(n,m). Global
similarity (GS, 0 ≤ GS ≤ 1) was defined as GS(a,b) = s/max(n,m). While high values for LS
are obtained when the protein family content of a contig is similar to that encoded in a
segment of a larger contig, large GS values can only be obtained when the similar protein
content involves a large fraction of the proteins encoded in each contig.
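The two metrics can be illustrated with a minimal Python sketch. It assumes that each contig is represented by the set of protein-family identifiers it encodes, so that n and m count distinct families; the function name and the family labels are hypothetical and are not part of the original pipeline.

```python
def similarity(families_a, families_b):
    """Return (LS, GS) for two contigs given as iterables of protein-family IDs.

    Assumption: n and m are taken as the number of distinct families per contig,
    which is one reading of "excluding duplications".
    """
    a, b = set(families_a), set(families_b)
    if not a or not b:
        raise ValueError("each contig needs at least one annotated protein family")
    s = len(a & b)                   # shared protein families
    ls = s / min(len(a), len(b))     # local similarity
    gs = s / max(len(a), len(b))     # global similarity
    return ls, gs


# Hypothetical example: a small contig fully contained in a larger one.
small = ["repA", "parA", "parB"]
large = ["repA", "parA", "parB", "traU", "traN", "traG"]
ls, gs = similarity(small, large)
print(ls, gs)
```

In this example the small contig matches a segment of the large one completely, so LS is high, whereas GS stays below the 0.7 threshold used for contig families because only half of the larger contig is covered.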
Next we performed the following ECE separation algorithm:
1) Contig families were built using MCL (21) to cluster contigs above a global
similarity threshold of GS>0.7;
2) A single contig was considered as an ECE if it was circularized and/or it belonged
to one of the contig families defined in step 1;
3) For each given contig - even for those included in the single contig ECE set defined
in step 2 - we considered whether it could be a fragment of a larger ECE consisting
of two or more contigs from the same band. We therefore looked for contigs that
matched ECEs from the same band in a way to suggest that they could be part of a
single larger, multi-contig ECE. In that way, the set of potential references for the
given contig (the query contig) was defined as the set constituted by any contig in
our collection that i) was larger than the query contig (in terms of number of
encoded proteins) and ii) had a LS above 0.5 with the query contig. To prioritize
contigs of high quality as references, we removed any other contig from the set of
potential references if some of the potential references were one-contig ECEs;
4) For each potential reference contig we computed its LS with each contig in the
same band to which the query contig belonged (the query band). Contigs in the
query band were ordered by the LS value in decreasing order and grouped
sequentially. A joined global similarity (JGS) to the reference was computed at
each contig grouping step;
5) We kept the group of contigs in the query from which the largest JGS value was
obtained (the optimal JGS) in step 4;
6) Steps 4 and 5 were repeated for each potential reference, and the reference giving
the largest optimal JGS was considered as the best reference for the corresponding
set of contigs in the query band. This set of contigs was proposed as a multi-contig
ECE;
7) The proposed multi-contig ECE was validated by checking that the joined contigs
had similar read coverage in the same host. We used SSAHA2 (22) for mapping
the reads and SAMtools (23) to compute the mapping coverage;
8) Confirmed multi-contig ECEs were separated from the original band. We found
one multi-contig ECE per band at most.
Any contig that was used as a reference for a confirmed multi-contig ECE was included in
the set of one-contig ECEs. Some contigs that were initially defined as single contig ECEs
were removed from that category because they could be integrated in a larger (multi-contig)
ECE. Contigs in a band that were not previously defined as single contig ECEs or were not
integrated into a multi-contig ECE using a reference were considered by default as part of
a single, separate (low-evidence) ECE. Such a remnant ECE could comprise one or more
contigs.
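Steps 4 and 5 of the separation algorithm can be sketched as a greedy grouping procedure. The text does not give an explicit formula for JGS, so the sketch below assumes that JGS is the global similarity between the reference and the union of the contigs grouped so far; this definition, the function names and the example data are assumptions made for illustration only. Contigs are again represented as non-empty sets of protein-family identifiers.

```python
def local_similarity(a, b):
    """LS between two non-empty sets of protein-family IDs."""
    return len(a & b) / min(len(a), len(b))


def best_group(reference, band_contigs):
    """Greedy grouping of query-band contigs against one reference (steps 4-5).

    reference: set of family IDs of the reference contig.
    band_contigs: dict mapping contig name -> set of family IDs.
    Returns (optimal JGS, list of grouped contig names).
    Assumption: JGS = shared / max(size of merged group, size of reference).
    """
    # Order the contigs of the query band by decreasing LS to the reference.
    ranked = sorted(
        band_contigs,
        key=lambda name: local_similarity(band_contigs[name], reference),
        reverse=True,
    )
    merged, group = set(), []
    best_jgs, best_names = 0.0, []
    for name in ranked:
        # Add one contig at a time and recompute the joined similarity.
        group.append(name)
        merged |= band_contigs[name]
        jgs = len(merged & reference) / max(len(merged), len(reference))
        if jgs > best_jgs:
            best_jgs, best_names = jgs, list(group)
    return best_jgs, best_names


# Hypothetical band: two contigs cover the reference, a third is mostly unrelated.
reference = {f"f{i}" for i in range(1, 11)}
band = {
    "c1": {"f1", "f2", "f3", "f4", "f5"},
    "c2": {"f6", "f7", "f8", "f9"},
    "c3": {"f10", "x1", "x2"},
}
jgs, names = best_group(reference, band)
print(round(jgs, 2), sorted(names))
```

Repeating this for every potential reference and keeping the reference with the largest optimal JGS corresponds to step 6; the read-coverage validation of step 7 is not covered by the sketch.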
3.3.6. Classification of ECEs.
For detecting conjugative and mobilizable ECEs, we used a substantially updated version of the
protein profiles previously assembled in (24). For conjugative plasmids, we first searched
for matches to TraU/VirB4 from each of the mating pair formation (MPF) system families
defined previously (25) since this is the only protein that is associated with all known type
IV secretion systems (T4SS) (or at least the only one sufficiently conserved in sequence). We
then gathered all proteins found within a window of -20/+20 ORFs around the TraU/VirB4 gene to
determine whether a functional T4SS was present. For each MPF type, we carried out
similarity searches between all proteins and clustered them into families. These families
were aligned, analyzed and curated. We iterated based on criteria such as sensitivity and
specificity, and then made multiple alignments that were used to build protein profiles with
HMMER (26, 27). This led to a database of ~120 protein profiles associated with
conjugation. These correspond in general to the known essential proteins in each system
(albeit a few evolve too fast and give poor sequence similarity hits). The profiles were then
searched using HMMER in the proteomes of ECEs. We filtered the results by using an E-value threshold of 0.01, and coverage (ratio between the target and query lengths) of more
than 0.5. Clusters are based on the findings in the entire replicon.
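The hit-filtering step can be illustrated with a short Python sketch. It does not call HMMER; it only applies the two thresholds to hit records that are assumed to have been parsed already. The record layout, the profile names and the interpretation of coverage as aligned profile positions divided by profile length are assumptions, since the text defines coverage only as a ratio between target and query lengths.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One profile-to-protein hit (hypothetical record layout)."""
    protein: str
    profile: str
    evalue: float
    aligned_len: int   # profile positions covered by the alignment
    profile_len: int   # total length of the profile


def keep(hit, max_evalue=0.01, min_coverage=0.5):
    """Apply the E-value and coverage thresholds described in the text."""
    coverage = hit.aligned_len / hit.profile_len
    return hit.evalue <= max_evalue and coverage > min_coverage


hits = [
    Hit("orf1", "profile_A", 1e-30, 300, 330),  # strong, nearly full-length hit
    Hit("orf2", "profile_B", 1e-5, 100, 800),   # significant but short: low coverage
    Hit("orf3", "profile_C", 0.5, 250, 300),    # long but not significant
]
kept = [h.protein for h in hits if keep(h)]
print(kept)
```

Only the first hit passes both criteria; the second fails on coverage and the third on E-value.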
For detecting phages, we searched the proteins encoded on the contigs against the ACLAME
Database of Mobile Elements v. 0.4 (28) using blastp with an E-value cutoff of 1e-10. We
unified the ACLAME phage and prophage categories into a single "phage" category so that we
ended up with two categories of target proteins: "phage" or "plasmid". If a given protein
from our contigs had a match with at least one phage protein in ACLAME, and there were no
matches with plasmid targets, the protein was classified as of phage origin. The same
criterion was used to decide whether a protein was associated with a plasmid (whenever there
was at least one plasmid match and no matches with phage targets). When there were matches
with both at least one plasmid protein and one phage protein, proteins were classified as
mixed (independently of the number of matches within each category). If the percentage of
an ECE's proteins recruited to phage was >20% and also more than 1.5 times the percentage
recruited to plasmid, we called the ECE a phage. All ECEs that were classified neither as
conjugative or mobilizable plasmids nor as phages were called non-transmissible ECEs.
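The phage-calling rule can be written down as a small Python sketch. It assumes that the ACLAME matches of each protein have already been reduced to a list of category labels, that percentages are taken over all proteins of an ECE, and that "mixed" proteins count toward neither category; these choices and all names are assumptions for illustration.

```python
def classify_protein(match_categories):
    """Label one protein from the categories of its ACLAME matches."""
    cats = set(match_categories)
    has_phage, has_plasmid = "phage" in cats, "plasmid" in cats
    if has_phage and has_plasmid:
        return "mixed"
    if has_phage:
        return "phage"
    if has_plasmid:
        return "plasmid"
    return "none"   # no match in either category


def is_phage_ece(proteins):
    """Return True if an ECE meets the >20% and >1.5x criteria for a phage.

    proteins: list with one entry per protein, each a list of match categories.
    """
    labels = [classify_protein(p) for p in proteins]
    if not labels:
        return False
    phage_pct = 100 * labels.count("phage") / len(labels)
    plasmid_pct = 100 * labels.count("plasmid") / len(labels)
    return phage_pct > 20 and phage_pct > 1.5 * plasmid_pct


# Hypothetical ECE: 40% phage and 20% plasmid proteins -> called a phage.
ece_a = [["phage"]] * 4 + [["plasmid"]] * 2 + [["phage", "plasmid"]] + [[]] * 3
# Hypothetical ECE: 30% phage and 30% plasmid proteins -> not called a phage.
ece_b = [["phage"]] * 3 + [["plasmid"]] * 3 + [[]] * 4
print(is_phage_ece(ece_a), is_phage_ece(ece_b))
```

The second example passes the 20% criterion but fails the 1.5-fold criterion, which is what keeps plasmids carrying some phage-like genes out of the phage class.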
3.3.7. ECE families and network of gene sharing among families
For analysis of the history of ECE transfer among populations, ECE families were
identified based on gene content and sequence similarity with cutoffs of 60% and 97% in
order to integrate over more ancient and recent transfer, respectively. These families were
based on the contig families identified above (Section 3.3.5), after removal of those contigs
that were integrated into multi-contig ECEs. Multi-contig ECEs were assigned to the
same family as the corresponding reference contig; however, if the reference contig was
not already a member of a family, a new family was created. ECEs defined as remnants
were not part of any ECE family.
To determine the relationship among the ECE genomes in terms of shared genes, ECEs
were assigned to groups based on clustering of shared proteins. In brief, open reading
frames (ORFs) were identified using Glimmer 3.0 (29), followed by the clustering of ORF
proteins using OrthoMCL (30, 31), in which a minimum 97% coverage of the longer
sequence was required as well as an E-value cut-off of 1e-5. Whole genomes of ECEs were
then clustered based on these shared proteins, using the FT ClustNSee clustering algorithm
in Cytoscape (32-34).
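The idea of linking ECEs through shared proteins can be illustrated with a minimal Python sketch that counts, for every pair of ECEs, the number of protein families they have in common. It does not reproduce OrthoMCL or the Cytoscape clustering; the input mapping and all ECE and family names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def shared_family_edges(ece_families):
    """Count shared protein families for every pair of ECEs.

    ece_families: dict mapping ECE name -> set of protein-family IDs.
    Returns a dict mapping (ece_a, ece_b) -> number of shared families,
    with each pair listed once in alphabetical order.
    """
    # Invert the mapping: which ECEs carry each family?
    members = defaultdict(set)
    for ece, families in ece_families.items():
        for family in families:
            members[family].add(ece)

    # Every family present in two or more ECEs links all pairs that carry it.
    edges = defaultdict(int)
    for eces in members.values():
        for a, b in combinations(sorted(eces), 2):
            edges[(a, b)] += 1
    return dict(edges)


example = {
    "eceA": {"fam1", "fam2", "fam3"},
    "eceB": {"fam2", "fam3"},
    "eceC": {"fam3", "fam9"},
    "eceD": {"fam7"},          # shares nothing, stays unconnected
}
edges = shared_family_edges(example)
print(edges)
```

ECEs without any shared family, such as the last one in the example, do not appear in the edge list and would remain isolated nodes in the resulting network.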
3.4. Results and Discussion
3.4.1. ECE Characterization
Gel electrophoresis was employed to screen 660 Vibrio isolates for the presence of ECEs.
This revealed 140 DNA bands ranging between 1 and 200 kbp distributed across 101 Vibrio
strains (15.3% of all isolates). DNA extracted from each band was sequenced using next
generation sequencing (Illumina or 454) and assembled into 270 contigs containing a total
of 4,246,711 bp with a size range between 1 and 95 kbp (Table 2). The GC content within
the contigs varied between 36% and 53% (average value of 44.3%). By using a novel
bioinformatics pipeline, we were able to identify 187 ECEs from these contigs.
3.4.2. Proteins of ECEs
We annotated 4,751 protein, five 5S RNA and three tRNA genes from the 187 ECEs
(supplementary table 1). The majority of proteins (2,992; 63.0%) are hypothetical
according to standard RAST annotation. However, when comparing these results to
annotation using the ACLAME database (version 0.4), which is a curated collection of
prokaryotic mobile genetic elements from various sources (phages, plasmids, transposons
and genomic islands) (28, 35), only 42.8% of proteins were categorized as hypothetical.
This discrepancy demonstrates the value of curated databases for annotation but also that
ECE function remains overall poorly characterized.
Among the ~37% (1,759) of proteins with functional annotations, we identified 178 that are
typically employed by plasmids for maintenance within their hosts (5). These fell into the
following categories: 50 resolvases, 29 replicases (36), 91 partitioning systems (37), 41
toxin-antitoxin systems (38), and 17 restriction-modification systems (39). The largest
functional category comprised 296 proteins that belong to T4SS (25), whose major
function is to mediate transfer of conjugative and mobilizable ECEs. We also identified 70
proteins belonging to T6SS (40), now recognized as a complex machinery that can
pump effector proteins into recipient cells, playing a role in pathogenicity. Our finding of a
relatively high incidence of T6SS proteins within Vibrio ECEs suggests that either T6SS
play roles outside of pathogenicity or that many of the populations harbor potential
pathogens (41).
In addition to the above systems, which do not offer the plasmids any additional function
beyond their own maintenance and transfer, we also identified a few groups of proteins
with potentially beneficial functions for the hosts. Among these are 14 proteins involved
in amino acid metabolism, 21 in carbohydrate metabolism and another 21 in stress
responses. To what extent these proteins are functional in pathways that may benefit the
host, however, remains unknown at this point.
Interestingly, we also identified three 5S rRNAs on ECEs. That the detection of these genes
is not an artifact of contamination with host chromosome is further confirmed by screening
of these ECEs for the presence of 16S and 23S rRNAs, which are usually detected adjacent
to 5S rRNAs within the same operon (42). Our results showed the 5S rRNAs occurred
alone suggesting that these ECEs act as carriers for these genes. This observation further
strengthens previous observations that conserved rRNA molecules can be spread by HGT
with plasmids as vehicles (43, 44).
3.4.3. Distribution of ECEs among isolates and populations
To investigate the distribution pattern of the ECEs among the 25 previously characterized
Vibrio populations (8), we mapped ECE presence onto the phylogenetic tree of all initially
screened 660 isolates. Our results show that ECEs are broadly but not evenly distributed
among host populations (Figure 1). For example, no ECE was detected in groups 5 (V.
anguillarum), 7 (V. fischeri/logei), 9 (V. breoganii) and 10 (V. sp.) whereas ECEs were
detected in all four isolates in group 6 (V. sp.) and six of ten of the isolates in group 14 (V.
kanaloae) (Figure 1). Absence of ECEs in our collection of V. anguillarum and V. fischeri
conflicts with previously published results from other research groups, in which ECEs were
detected in both species (45, 46). The cause of this difference is unknown, but it may reflect
the fact that ECEs were previously studied in the context of environmental stress (such as
heavy metal contamination) on bacterial populations, since the ECEs detected previously
could be linked to detoxification and other stress-related functions (47, 48).
Next, we asked whether the lifestyle of the Vibrio strains at the time of isolation showed
any correlation with ECE presence. As mentioned above, association with one of four size
fractions was used to ecologically categorize the isolates where the smallest (<1 µm)
fraction indicated free-living lifestyle whereas presence in larger size fractions suggested
attachment to particles or organisms. We found strong enrichment of ECEs in the free-living phase with 101 ECEs (54%) being associated with the smallest fraction. This
association counters the intuition that particle associated bacteria, which live in more dense
communities, are more prone to acquire mobile elements. A possible explanation for this
discrepancy is that the high incidence of ECEs in small size fractions reflects stability
within host and/or environmental selection, rather than high transmission. On the other
hand, conjugational plasmids detected in this study are primarily of the F-type, which
possess T4SS that due to their thick, flexible pili can mediate plasmid transfer in liquid
better than on surfaces (see section 3.4.4).
Figure 1. Distribution of ECEs among Vibrio hosts. Vibrio phylogeny based on the
hsp60 protein-coding gene. Colored rings indicate size fractions and dark bars indicate
the presence of at least 1 ECE. Populations are labeled with numbers in the
shaded areas. The closest named species to numbered populations are as follows: P1,
Enterovibrio calviensis; P2, Enterovibrio norvegicus; P3, Vibrio ordalii; P4, Vibrio
rumoiensis; P5, Vibrio anguillarum; P6, Vibrio sp.; P7, Vibrio fischeri/logei; P8, Vibrio
fischeri; P9, Vibrio breoganii; P10, Vibrio sp.; P11, Vibrio splendidus cluster 1; P12,
Vibrio sp.; P13, Vibrio crassostreae; P14, Vibrio kanaloae; P15, Vibrio cyclitrophicus;
P16 and P17, Vibrio tasmaniensis; P18 to P25, Vibrio splendidus.
The frequency distribution of ECEs per host cell shows that nearly half carry a single ECE,
about a quarter carry 2 ECEs and the remainder contain between 3 and 11 ECEs (Figure
2). Some unusual combinations of ECEs were detected in several isolates. For example,
strain FF472 contains a phage element, two mobilizable, one conjugative and three non-transmissible ECEs. In a separate case, strain FF112 contains, in addition to a phage
element, two different types of conjugative plasmids, one mobilizable plasmid and one
non-transmissible ECE. While it is common to carry multiple plasmids within one cell, it
is very unusual to find more than one type of conjugative plasmid in one isolate.
[Figure 2 plot: bar chart; x-axis, number of ECEs per strain (1 to 11); y-axis, ticks from 0 to 40]
Figure 2. Distribution of the number of ECEs per strain for all Vibrio isolates with
at least one ECE. Nearly 50% of the strains with ECEs have only 1 element; however
the distribution has a long tail, which includes one isolate with 11 ECEs.
3.4.4. Classification of ECEs
All 187 ECEs were classified into four types: conjugative, mobilizable and non-transmissible plasmids, and bacteriophage. Conjugative and mobilizable plasmids both
contain relaxases while only the former also encode a T4SS, which is necessary for self-transmission (25). Based on these criteria, 24 were defined as conjugative ECEs. All of
these contained at least five T4SS coding genes but only 12 also had a detectable relaxase
gene. Thirty-eight putative mobilizable plasmids were identified based on the presence of only a
relaxase gene (Table 2); these presumably require a T4SS acting in trans for transmission
and hence at least occasional coexistence with a conjugative plasmid. Phages were
identified by blasting the ECE sequences against the ACLAME Database. ECEs containing
only genes annotated as of phage origin were defined as such. However, some ECEs were
found to carry both plasmid and phage proteins, in which case we defined the ECE as a
bacteriophage if phage proteins accounted for more than 20% of the total proteins and
were more than 1.5 times as abundant as plasmid proteins. In total, 22 phage ECEs were
identified in our collection based on these criteria. In addition to phages, and conjugative
and mobilizable plasmids, there were 103 ECEs that did not meet criteria of any of these
three categories and were therefore, in accordance with the scheme proposed by Smillie et
al. (13), defined as "non-transmissible" ECEs. Overall, the plasmid composition in our Vibrio
collection is, with 14.5% conjugative, 23.0% mobilizable and 62.4% non-transmissible plasmids
(24, 38 and 103 of the 165 plasmids, respectively; Table 3 and 4), slightly different from the
Proteobacterial average, which is 20% conjugative, 30% mobilizable and 50% non-transmissible
plasmids (3).
We further categorized the types of T4SS found in conjugative plasmids since different
types possess properties informative for interpretation of ECE mobility. Sequence
comparison suggested that 16 T4SS were of the F type, 6 of the T type and 2 of the G type.
To date, Type G T4SS has not been fully characterized and its presence has only been
reported in integrative conjugative elements (ICEs) and very rarely in plasmids (http://db-mml.sjtu.edu.cn/SecReT4/index.php). Type F systems have thick, flexible pili that allow high
frequency of conjugation in liquid whereas type T has rigid pili that allow high frequency
conjugation only on solid surfaces (49). The prevalence of the F type is consistent with
preferential occurrence of ECEs in free-living cells and suggests there is selection for
mating in liquid rather than in biofilms. Moreover, the broadest-host range plasmids are
found among type T, whereas F-like plasmids tend to be narrow host-range, suggesting
that the conjugative plasmids in this dataset may be limited to fairly closely related hosts.
We also identified one ECE (m055) from strain FF112 that seems to contain 3 T4SS: 2
type F and 1 type T. This is unexpected since multiple T4SS in a single plasmid are
atypical. Although this might point to an interesting biological function, it will have to be
confirmed that ECE m055 does not represent an artifact assembled from multiple,
independent ECEs. Additionally, the same isolate harbors ECE m058, which contains a
type G T4SS so that 3 types of T4SS co-exist within the same host. Taken together, this
study provides the first evidence for the coexistence of more than two types of
conjugative plasmids in a single strain.
3.4.5. ECE families and analysis of transmissibility
Classification of all 187 ECEs into families according to their sequence similarity
apportioned 93 of the ECEs among 31 multi-member families, while 94 ECEs remained
singletons. The numbers of ECEs detected within multi-member families ranged from 2 to
13 (Figure 3). Family I, a siderophore synthesis plasmid family, will be described in detail
in the next section (Ch. 3.4.6). Family II contained a disproportionately large number of
members, represented as the largest dot in Figure 3. Further analysis of the protein
sequences of ECEs from family II identified 27 ORFs, among which three share high
sequence similarity with phage genes and 12 other genes are homologous to phage genes,
suggesting that this may be a new phage family that can propagate in a plasmid-like
fashion. The categorization into families also allows analysis of distribution of specific
types of ECEs among the Vibrio populations in order to infer host-range and evolutionary
turnover. A network analysis of ECE distribution among potential hosts shows that while
many multi-member ECE families are population-specific, about 63% are distributed across
populations, suggesting broad host-ranges (Figure 4). This is especially the case for the
multi-member phage families, which appear to be able to propagate in diverse hosts.
Surprisingly, a large portion of the non-transmissible ECEs also displays broad
host range. Moreover, because most highly related ECEs are also in the non-transmissible
category and because their nucleotide diversity is generally lower than that of their hosts,
transfer, rather than vertical inheritance, appears to be the rule for this type of ECE. This
pattern of primarily low nucleotide diversity among members of ECE families also
suggests rapid evolutionary turnover, i.e., ECEs arise, spread and are lost frequently.
Finally, considering that most of the closely related ECEs are classified as non-transmissible, the conclusion of rapid turnover is puzzling and suggests that a currently
unrecognized, more direct transfer mechanism is at work.
[Figure 3 plot: x-axis, Size (Kbp), log scale from 2 to 100; point classes: phage, mobilizable, non-transmissible, conjugative; point size from 2 to 13 ECEs]
Figure 3. ECE family diversity as a function of family size, ECE size and classes.
Average percentage identities are calculated for each family and plotted against the size
of the element in base pairs. The size of the points indicates the number of members in
the ECE family. The smallest size is 2 and the largest 13 (a widespread phage family).
The two largest families are labeled as I and II, representing a siderophore plasmid family
and a phage family, respectively.
[Figure 4 networks: panels A and B; legend entries: non-transmissible, mobilizable, phage]
Figure 4. ECE family distribution across the Vibrio phylogeny. A) Network
connections link genotypes sharing ECE families with an average percentage identity
> 60% and their classification according to backbone genes and transmissibility. B)
Network showing ECE families with an average percentage identity > 97% and their
classification according to backbone genes and transmissibility.
3.4.6. Genome dynamics of ECEs
Because of the seemingly rapid turnover of ECEs, we investigated to what extent ECEs
themselves are evolutionarily stable entities by constructing a network of recently
exchanged genes. To approximate relatively recent transfer, we first clustered genes into
closely related families (97% sequence identity) and then determined how many ECEs
share these families. The resulting network suggests a high incidence of gene exchange
within a hub of strongly connected conjugative plasmids, which share many types of genes,
with T4SS genes being the most frequent. Many non-transmissible and a few mobilizable
plasmids are also strongly connected to this hub, while links to the remainder of the plasmids
and, especially, phages are sparse, suggesting relatively minor gene exchange (Figure 5).
Several plasmid families and one phage family are not connected to the network.
[Figure 5 legend. Node types: Proteins, Conjugative, Mobilizable, Non-transmissible, Phages.
Host species: Vibrio crassostreae, Vibrio ordalii, Vibrio sp., Vibrio fischeri, Vibrio
splendidus, Vibrio tasmaniensis, Vibrio cyclitrophicus, Vibrio kanaloae, Enterovibrio
norvegicus, Enterovibrio calviensis-like. Cluster labels: I, II, III.]
Figure 5. ECE genome cluster network. ECEs are connected through proteins (blue
dots) shared by at least two ECEs at 97% sequence similarity. The diameter of an ECE
symbol indicates the relative size of the genome of the ECE. Three ECE clusters (I, II
and III) were selected using FT ClusterNSee as examples of ECE genome dynamics
(see Figure 6).
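The network construction described in this section can be illustrated with a minimal sketch. It assumes that genes have already been clustered into families at 97% identity; the function names, ECE identifiers and family names below are hypothetical and serve only as toy input, not as data from this study.

```python
from collections import defaultdict
from itertools import combinations


def build_shared_family_network(ece_to_families):
    """Link every pair of ECEs that shares at least one gene family.

    ece_to_families maps an ECE id to the set of gene-family ids it carries
    (families are assumed to group genes at >= 97% identity).
    Returns a dict: frozenset({ece_a, ece_b}) -> set of shared family ids.
    """
    # Invert the mapping: which ECEs carry each family?
    family_to_eces = defaultdict(set)
    for ece, families in ece_to_families.items():
        for family in families:
            family_to_eces[family].add(ece)

    # Every family carried by two or more ECEs contributes an edge per pair.
    edges = defaultdict(set)
    for family, eces in family_to_eces.items():
        for ece_a, ece_b in combinations(sorted(eces), 2):
            edges[frozenset((ece_a, ece_b))].add(family)
    return dict(edges)


def unconnected_eces(ece_to_families, edges):
    """Return the ECEs that share no family with any other ECE."""
    linked = set().union(*edges) if edges else set()
    return sorted(set(ece_to_families) - linked)


# Hypothetical toy input (not taken from the thesis data).
toy = {
    "eceA": {"fam_t4ss", "fam_rep"},
    "eceB": {"fam_t4ss"},
    "eceC": {"fam_other"},
}
edges = build_shared_family_network(toy)
isolated = unconnected_eces(toy, edges)
print(edges)
print(isolated)
```

In this toy case, eceA and eceB are linked through one shared family, while eceC remains unconnected, analogous to the plasmid and phage families that fall outside the network.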
The structure and connections of non-transmissible and mobilizable plasmids within this
network allow some conclusions as to their evolutionary origins, which we illustrate with
the following examples.
The first example is a cluster of non-transmissible plasmids (labeled I in Figure 5 and
corresponding to family I in Figure 3), which share a cluster of siderophore genes with
otherwise unrelated plasmids detected in several other Vibrio species (50, 51). The circular
plasmids of family I contain three modules (Figure 6 I). The first encodes a plasmid
partition system (blue ORFs) that allows the plasmid to be maintained in its host strain.
The function of the second module (orange) remains unidentified. The third module (green)
encodes a protein complex that shares high sequence identity with siderophore biosynthesis
genes from a previously characterized plasmid family from V. vulnificus and V.
parahaemolyticus (50, 51), suggesting that these genes are mobile among ECEs. Finally,
plasmids within family I are somewhat atypical since they display higher sequence
divergence, suggesting persistence over longer evolutionary times than many of the other
non-transmissible plasmids, which consist of clusters of highly identical genomes.
[Figure 6 image. Panels I, II and III with position scales in bp; panel III is divided into
sub-clusters A to E.]
Figure 6. Sequence alignment of ECEs in three representative clusters from the network analysis. Homologous sequence
regions are highlighted in grey, as in clusters I and III. The corresponding network of each cluster is also included. Cluster I is a
highly conserved siderophore synthesis plasmid family. Cluster II contains two conjugative ECEs and one non-transmissible ECE.
Cluster III contains 23 mobilizable ECEs and one non-transmissible ECE, grouped into 5 sub-clusters, indicated as A to E.
The second example (cluster II in Figure 6 II) illustrates that gene gain or loss may change
one plasmid category into another. ECEs m161 and m096 are conjugative plasmids that
share almost all of the backbone genes (blue ORFs), which are responsible for conjugation.
They differ, however, in two relatively large regions, which are present in m161 but absent
in m096. These regions are also shared by the non-transmissible plasmid m031, which is
overall more similar to m161 except for the lack of genes responsible for conjugative
transfer. This indicates that non-transmissible plasmids may originate from conjugative
plasmids, and vice versa, via gain or loss of T4SS and relaxase genes.
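The category transition described here can be sketched as a simple rule on the presence of relaxase and T4SS genes. This is a simplified illustration, not the classification pipeline used in this study: the function name and boolean inputs are hypothetical, and treating an ECE that carries a T4SS but no relaxase as non-transmissible is an assumption of the sketch.

```python
def classify_ece(has_relaxase, has_t4ss, is_phage=False):
    """Assign a transmissibility category from backbone gene content.

    Simplified rule (an assumption of this sketch):
      relaxase + T4SS -> conjugative
      relaxase only   -> mobilizable (needs a co-occurring conjugative ECE)
      otherwise       -> non-transmissible
    """
    if is_phage:
        return "phage"
    if has_relaxase and has_t4ss:
        return "conjugative"
    if has_relaxase:
        return "mobilizable"
    return "non-transmissible"


# Hypothetical backbone content before and after loss of the transfer genes.
before_loss = classify_ece(has_relaxase=True, has_t4ss=True)
after_loss = classify_ece(has_relaxase=False, has_t4ss=False)
print(before_loss, "->", after_loss)
```

Under this rule, losing both gene sets moves an element from the conjugative to the non-transmissible category, which mirrors the relationship proposed between m161 and m031.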
The final example (cluster III in Figure 6 III) illustrates a case where five unrelated families
containing 23 mobilizable plasmids are connected only by a few shared backbone genes.
Families A, B, C and D have mostly Type P relaxase genes in common, while the remainder
of their genomes is unrelated. Even more indirect is the connection of Family E to the other
families in this group. It shares Type C relaxase genes with ECE m052, which has some
accessory genes in common with ECE m181. The latter is connected to Families A, B, C
and D via Type P relaxase genes.
The above three examples illustrate that the overall architecture of plasmids can change
over relatively short evolutionary timespans and that transitions between different
categories of plasmids are possible via gain or loss of relaxase and T4SS systems.
Considering the high mobility of ECEs among different host populations detected in this
study, this may lead to rapid transfer and reassortment of functions of potential benefit to
hosts.
3.5. Summary
In summary, we isolated 187 ECEs from 660 strains categorized into 25 different
ecologically and genetically cohesive populations. We identified the following elements:
22 bacteriophages, 24 conjugative ECEs and 38 mobilizable ECEs, the latter requiring
co-occurring conjugative plasmids for successful transfer. Moreover, 103 non-transmissible
ECEs that do not encode any genes for self-transfer were detected. We further found that
ECEs were significantly enriched in free-living Vibrio cells, suggesting an association of
ECEs with host environment. Our data show that non-transmissible plasmids appear to be
the most common among Vibrio ECEs and that they may have been transferred recently and
frequently through mechanisms yet to be uncovered. This suggests that these plasmids have
most certainly been misnamed. The high incidence of putative temperate phages that
appear to propagate as plasmids is surprising. Although plasmid-like phages have been
previously described, the phages detected here appear novel and their prevalence suggests
a previously unanticipated role in the marine environment. Finally, the dynamic nature
of ECE genomes offers their host strains a rich supply of external genetic material that may
allow the rapid assembly of different functions.
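As a quick consistency check on the counts reported in this summary, the following sketch sums the four categories and derives their shares. The variable names are illustrative. Note that the ratio of ECEs to screened strains is not the fraction of strains carrying an ECE, since several hosts carry more than one element (Table 4).

```python
# Counts as stated in the summary above.
counts = {
    "phage": 22,
    "conjugative": 24,
    "mobilizable": 38,
    "non-transmissible": 103,
}
strains_screened = 660

total_eces = sum(counts.values())
shares = {kind: n / total_eces for kind, n in counts.items()}
eces_per_strain = total_eces / strains_screened

print(f"total ECEs: {total_eces}")
for kind, share in shares.items():
    print(f"{kind}: {share:.1%}")
print(f"ECEs per screened strain: {eces_per_strain:.2f}")
```

The four categories add up to the 187 ECEs reported, with non-transmissible elements making up slightly more than half of the total.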
Table 1. An extended barcode set beyond the Roche Titanium-compatible barcode kit
MID name  Sequence (5'-3')
MID 13    CCATCTCATCCCTGCGTGTCTCCGACTCAGCATAGTAGTGAGATGTGTATAAGAGACAG
MID 14    CCATCTCATCCCTGCGTGTCTCCGACTCAGCGAGAGATACAGATGTGTATAAGAGACAG
MID 15    CCATCTCATCCCTGCGTGTCTCCGACTCAGATACGACGTAAGATGTGTATAAGAGACAG
MID 16    CCATCTCATCCCTGCGTGTCTCCGACTCAGTCACGTACTAAGATGTGTATAAGAGACAG
MID 17    CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTCTAGTACAGATGTGTATAAGAGACAG
MID 18    CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTACGTAGCAGATGTGTATAAGAGACAG
MID 19    CCATCTCATCCCTGCGTGTCTCCGACTCAGTGTACTACTCAGATGTGTATAAGAGACAG
MID 20    CCATCTCATCCCTGCGTGTCTCCGACTCAGACGACTACAGAGATGTGTATAAGAGACAG
MID 21    CCATCTCATCCCTGCGTGTCTCCGACTCAGCGTAGACTAGAGATGTGTATAAGAGACAG
MID 22    CCATCTCATCCCTGCGTGTCTCCGACTCAGTACGAGTATGAGATGTGTATAAGAGACAG
MID 23    CCATCTCATCCCTGCGTGTCTCCGACTCAGTACTCTCGTGAGATGTGTATAAGAGACAG
MID 24    CCATCTCATCCCTGCGTGTCTCCGACTCAGTAGAGACGAGAGATGTGTATAAGAGACAG
MID 25    CCATCTCATCCCTGCGTGTCTCCGACTCAGTCGTCGCTCGAGATGTGTATAAGAGACAG
MID 26    CCATCTCATCCCTGCGTGTCTCCGACTCAGACATACGCGTAGATGTGTATAAGAGACAG
MID 27    CCATCTCATCCCTGCGTGTCTCCGACTCAGACGCGAGTATAGATGTGTATAAGAGACAG
MID 28    CCATCTCATCCCTGCGTGTCTCCGACTCAGACTACTATGTAGATGTGTATAAGAGACAG
MID 29    CCATCTCATCCCTGCGTGTCTCCGACTCAGACTGTACAGTAGATGTGTATAAGAGACAG
MID 30    CCATCTCATCCCTGCGTGTCTCCGACTCAGAGACTATACTAGATGTGTATAAGAGACAG
MID 31    CCATCTCATCCCTGCGTGTCTCCGACTCAGAGCGTCGTCTAGATGTGTATAAGAGACAG
MID 32    CCATCTCATCCCTGCGTGTCTCCGACTCAGAGTACGCTATAGATGTGTATAAGAGACAG
MID 33    CCATCTCATCCCTGCGTGTCTCCGACTCAGATAGAGTACTAGATGTGTATAAGAGACAG
MID 34    CCATCTCATCCCTGCGTGTCTCCGACTCAGCACGCTACGTAGATGTGTATAAGAGACAG
MID 35    CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGTAGACGTAGATGTGTATAAGAGACAG
MID 36    CCATCTCATCCCTGCGTGTCTCCGACTCAGCGACGTGACTAGATGTGTATAAGAGACAG
MID 37    CCATCTCATCCCTGCGTGTCTCCGACTCAGTACACACACTAGATGTGTATAAGAGACAG
MID 38    CCATCTCATCCCTGCGTGTCTCCGACTCAGTACACGTGATAGATGTGTATAAGAGACAG
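The oligonucleotides in Table 1 appear to share constant flanking sequence around a variable 10-nt stretch. The sketch below recovers that variable part by stripping the longest prefix and suffix common to a set of oligos. It uses the first four entries of Table 1 as input; the function name is hypothetical, and interpreting the variable stretch as the MID barcode is an assumption of this example rather than something stated in the table.

```python
import os

# First four oligos from Table 1.
oligos = {
    "MID 13": "CCATCTCATCCCTGCGTGTCTCCGACTCAGCATAGTAGTGAGATGTGTATAAGAGACAG",
    "MID 14": "CCATCTCATCCCTGCGTGTCTCCGACTCAGCGAGAGATACAGATGTGTATAAGAGACAG",
    "MID 15": "CCATCTCATCCCTGCGTGTCTCCGACTCAGATACGACGTAAGATGTGTATAAGAGACAG",
    "MID 16": "CCATCTCATCCCTGCGTGTCTCCGACTCAGTCACGTACTAAGATGTGTATAAGAGACAG",
}


def split_constant_flanks(sequences):
    """Return (prefix, suffix) shared by all sequences.

    Note: with few sequences the shared flanks can extend into the variable
    region if the variable parts happen to start or end with the same base.
    """
    seqs = list(sequences)
    prefix = os.path.commonprefix(seqs)
    # Common suffix = reversed common prefix of the reversed sequences.
    suffix = os.path.commonprefix([s[::-1] for s in seqs])[::-1]
    return prefix, suffix


prefix, suffix = split_constant_flanks(oligos.values())
barcodes = {
    name: seq[len(prefix):len(seq) - len(suffix)]
    for name, seq in oligos.items()
}

print("constant 5' part:", prefix)
print("constant 3' part:", suffix)
for name, barcode in barcodes.items():
    print(name, barcode)
```

For these four oligos the shared flanks are 30 nt and 19 nt long, leaving a 10-nt variable segment in each 59-nt sequence.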
Table 2. The number, GC content and size of contigs
ECE ID  Contig ID  GC    Contig size
m001    1          0.41  48.4
m002    1          0.47  6.2
m003    1          0.41  30.7
m003    2          0.42  15.9
m004    1          0.37  3.9
m004    2          0.42  19
m005    1          0.52  6.5
m005    2          0.46  4.2
m005    3          0.43  13.7
m006    1          0.41  48.4
m007    1          0.42  5.2
m007    2          0.36  5.1
m008    1          0.53  14.2
m008    2          0.45  12.7
m008    3          0.47  11.8
m009    1          0.41  13.3
m009    2          0.41  9.4
m009    3          0.41  15
m010    1          0.41  48.4
m011    1          0.53  7.9
m011    2          0.47  5.8
m011    3          0.41  3.9
m011    4          0.52  4.2
m012    1          0.41  48.4
m013    1          0.42  27
m013    2          0.41  18
m013    3          0.4   20.3
m013    4          0.43  20
m013    5          0.42  36.8
m013    6          0.42  23.5
m013    7          0.42  18.7
m013    8          0.41  19
m014    1          0.42  2.8
m014    2          0.44  3.1
m014    3          0.47  2.8
m015    1          0.44  95.4
m016    1          0.42  19.6
m016    2          0.39  7.6
m016    3          0.41  10.7
m016    4          0.41  5.8
m017    1          0.45  4.2
m017    2          0.51  3.4
m017    3          0.51  3.4
m018    1          0.42  11
m018    2          0.42  21.9
m018    3          0.42  15.9
m019    1          0.51  8.1
m020    1          0.44  40.6
m021    1          0.42  6.6
m021    2          0.43  6.7
m022    1          0.43  17
m023    1          0.41  8.1
m024    1          0.5   42.4
m025    1          0.51  16.5
m025    2          0.46  13.7
m025    3          0.41  9.6
m026    1          0.5   37.9
m027    1          0.46  11.8
m027    2          0.47  8.1
m027    3          0.5   40.4
m027    4          0.46  9
m028    1          0.47  10.5
m028    2          0.49  10
m028    3          0.5   4.7
m029    1          0.44  95.9
m030    1          0.41  24
m031    1          0.4   13.3
m031    2          0.4   5.6
m032    1          0.48  47.3
m033    1          0.46  33.4
m033    2          0.48  20.9
m034    1          0.47  24.2
m034    2          0.43  14.5
m035    1          0.42  19
m036    1          0.51  16
m037    1          0.48  23
m037    2          0.46  14
m037    3          0.42  13.2
m038    1          0.42  17.6
m039    1          0.42  20.1
m040    1          0.48  7
m040    2          0.43  6.1
m040    3          0.43  5.8
m041    1          0.44  35
m042    1          0.44  35
m043    1          0.45  43.1
m044    1          0.42  18.3
m045    1          0.44  11.4
m046    1          0.49  35.8
m047    1          0.45  68.3
m047    2          0.41  45.8
m048    1          0.41  24
m049    1          0.41  36.9
m049    2          0.41  12.2
m050    1          0.41  24.6
m051    1          0.46  47.1
m052    1          0.4   12.6
m053    1          0.47  32.8
m053    2          0.45  11.8
m054    1          0.41  24.5
m055    1          0.46  64.3
m055    2          0.41  25.2
m055    3          0.43  20.6
m055    4          0.43  15.4
m055    5          0.49  25.4
m055    6          0.52  17.5
m056    1          0.42  19.4
m057    1          0.42  46.9
m057    2          0.42  17.3
m058    1          0.48  59.2
m058    2          0.46  48.6
m059    1          0.42  17.8
m060    1          0.45  25.4
m061    1          0.47  12.6
m062    1          0.46  62.6
m063    1          0.51  17.1
m064    1          0.43  21.9
m065    1          0.48  16.8
m066    1          0.41  24.5
m067    1          0.48  48.1
m068    1          0.47  12.6
m069    1          0.46  32.8
m069    2          0.48  21.7
m070    1          0.48  16.6
m071    1          0.48  48.2
m072    1          0.42  20
m073    1          0.45  43.1
m074    1          0.49  9.5
m074    2          0.44  3
m074    3          0.47  11.3
m074    4          0.51  5.6
m074    5          0.44  3.5
m075    1          0.39  5.2
m076    1          0.51  1.7
m076    2          0.51  2.7
m076    3          0.45  2.5
m076    4          0.46  1.5
m076    5          0.52  2.3
m076    6          0.46  2.5
m077    1          0.42  2.8
m078    1          0.42  9.6
m079    1          0.46  2.1
m080    1          0.43  5.1
m081    1          0.36  5.8
m082    1          0.42  2.6
m082    2          0.44  2.4
m083    1          0.45  2.1
m084    1          0.39  3.6
m085    1          0.41  2.5
m086    1          0.5   2.2
m087    1          0.44  3.5
m087    2          0.44  5.1
m087    3          0.44  2.5
m087    4          0.44  4.4
m088    1          0.38  3.5
m089    1          0.38  2.1
m090    1          0.44  5.5
m091    1          0.42  1.7
m091    2          0.41  1.8
m091    3          0.43  2.8
m092    1          0.51  37.6
m093    1          0.41  2.9
m094    1          0.4   6.9
m095    1          0.41  4.5
m096    1          0.39  23
m097    1          0.42  4.7
m097    2          0.52  6
m098    1          0.4   3.3
m099    1          0.44  3.2
m100    1          0.44  3.6
m101    1          0.42  4.8
m102    1          0.38  6
m103    1          0.49  17.3
m103    2          0.48  14.8
m103    3          0.5   19.3
m103    4          0.48  14.1
m103    5          0.5   10.8
m104    1          0.44  13.6
m105    1          0.42  2.7
m106    1          0.39  3.7
m107    1          0.4   2.3
m108    1          0.39  5
m109    1          0.42  8.6
m110    1          0.45  36.6
m110    2          0.45  43.3
m111    1          0.37  5.8
m112    1          0.4   3
m113    1          0.46  4.7
m114    1          0.42  1.5
m115    1          0.47  3
m116    1          0.48  7.6
m117    1          0.52  12.2
m118    1          0.47  5.3
m119    1          0.41  3.5
m120    1          0.43  4.8
m121    1          0.4   3
m122    1          0.37  5.8
m123    1          0.37  5.8
m124    1          0.4   3
m125    1          0.42  4.8
m126    1          0.37  5.8
m127    1          0.52  12.1
m128    1          0.47  10.6
m129    1          0.44  13.6
m130    1          0.44  13.6
m131    1          0.39  4
m131    2          0.42  7.6
m132    1          0.4   4
m133    1          0.42  8.3
m134    1          0.4   2.5
m134    2          0.42  2.8
m135    1          0.42  5.1
m136    1          0.44  8.5
m137    1          0.42  5.1
m138    1          0.42  5.1
m139    1          0.39  7.1
m140    1          0.44  13.6
m141    1          0.39  2.5
m142    1          0.42  5.1
m143    1          0.39  2.6
m144    1          0.39  2.6
m145    1          0.39  2.5
m146    1          0.44  8.5
m147    1          0.42  54.4
m148    1          0.43  12.1
m149    1          0.42  3
m150    1          0.42  1
m151    1          0.5   10.6
m151    2          0.41  14.1
m151    3          0.44  17.2
m152    1          0.42  51.9
m153    1          0.44  75.1
m154    1          0.46  2.4
m155    1          0.39  2.6
m156    1          0.43  4.4
m157    1          0.4   3.2
m158    1          0.42  13.5
m159    1          0.42  53.1
m160    1          0.4   3.2
m161    1          0.4   31.2
m162    1          0.43  6
m163    1          0.44  12.8
m164    1          0.43  18.2
m164    2          0.41  74.5
m164    3          0.44  55.6
m164    4          0.47  22.7
m165    1          0.4   3
m166    1          0.4   5.4
m167    1          0.39  3.3
m168    1          0.41  4.8
m169    1          0.5   42.4
m170    1          0.43  8.4
m171    1          0.5   42.4
m172    1          0.42  4.4
m173    1          0.42  4.4
m174    1          0.42  5.6
m175    1          0.43  38.7
m176    1          0.42  4.4
m177    1          0.42  5.6
m178    1          0.42  5.6
m179    1          0.39  1.7
m179    2          0.41  2.4
m180    1          0.43  8.4
m181    1          0.42  7.2
m182    1          0.41  2.9
m183    1          0.39  3.3
m184    1          0.42  4.4
m185    1          0.41  2.9
m186    1          0.39  3.4
m187    1          0.42  2.3
Table 3. Classification, host and population of ECEs
ECE ID  Classification  Host    Season  Fraction  Species
m001    non-trans       1S_120  S       1         Vibrio kanaloae
m002    phage           1S_269  S       1         Vibrio sp.
m003    non-trans       1S_269  S       1         Vibrio sp.
m004    phage           5F_7    F       5         Vibrio fischeri
m005    non-trans       5S_122  S       5         Vibrio splendidus
m006    non-trans       5S_149  S       5         Vibrio kanaloae
m007    mobilization    5S_22   S       5         Vibrio splendidus cluster 1
m008    conjugation     5S_235  S       5         Vibrio splendidus
m009    non-trans       5S_239  S       5         Vibrio kanaloae
m010    non-trans       5S_240  S       5         Vibrio kanaloae
m011    conjugation     5S_5    S       5         Vibrio splendidus
m012    non-trans       1S_77   S       1         Vibrio kanaloae
m013    conjugation     FF_110  F       F         Vibrio sp.
m014    non-trans       FF_136  F       F         Vibrio sp.
m015    non-trans       FF_145  F       F         Vibrio splendidus
m016    conjugation     FF_152  F       F         Vibrio sp.
m017    phage           FF_152  F       F         Vibrio sp.
m018    non-trans       FF_174  F       F         Vibrio tasmaniensis
m019    conjugation     FF_174  F       F         Vibrio tasmaniensis
m020    conjugation     FF_273  F       F         Vibrio sp.
m021    non-trans       FF_286  F       F         Vibrio sp.
m022    non-trans       FF_286  F       F         Vibrio sp.
m023    non-trans       FF_286  F       F         Vibrio sp.
m024    phage           FF_304  F       F         Vibrio sp.
m025    conjugation     FF_308  F       F         Vibrio splendidus
m026    phage           FF_308  F       F         Vibrio splendidus
m027    conjugation     FF_375  F       F         Vibrio tasmaniensis
m028    non-trans       FF_375  F       F         Vibrio tasmaniensis
m029    non-trans       FF_61   F       F         Vibrio cyclitrophicus
m030    phage           1F_145  F       1         Vibrio splendidus
m031    non-trans       1F_145  F       1         Vibrio splendidus
m032    non-trans       1F_145  F       1         Vibrio splendidus
m033    mobilization    1F_145  F       1         Vibrio splendidus
m034    non-trans       1F_145  F       1         Vibrio splendidus
m035    phage           1F_145  F       1         Vibrio splendidus
m036    conjugation     1F_145  F       1         Vibrio splendidus
m037    conjugation     1F_146  F       1         Vibrio tasmaniensis
m038    phage           1F_146  F       1         Vibrio tasmaniensis
m039    phage           1F_146  F       1         Vibrio tasmaniensis
m040    mobilization    1F_146  F       1         Vibrio tasmaniensis
m041    non-trans       1F_189  F       1         Vibrio sp.
m042    non-trans       1F_279  F       1         Vibrio tasmaniensis
m043    non-trans       5S_118  S       5         Vibrio splendidus
m044    phage           5S_214  S       5         Vibrio splendidus
m045    non-trans       5S_214  S       5         Vibrio splendidus
m046    phage           5S_214  S       5         Vibrio splendidus
m047    conjugation     5S_214  S       5         Vibrio splendidus
m048    phage           5S_214  S       5         Vibrio splendidus
m049    non-trans       5S_239  S       5         Vibrio kanaloae
m050    phage           5S_239  S       5         Vibrio kanaloae
m051    non-trans       5S_268  S       5         Vibrio splendidus
m052    mobilization    FF_112  F       F         Vibrio tasmaniensis
m053    conjugation     FF_112  F       F         Vibrio tasmaniensis
m054    non-trans       FF_112  F       F         Vibrio tasmaniensis
m055    conjugation     FF_112  F       F         Vibrio tasmaniensis
m056    phage           FF_112  F       F         Vibrio tasmaniensis
m057    mobilization    FF_210  F       F         Vibrio tasmaniensis
m058    conjugation     FF_307  F       F         Vibrio sp.
m059    phage           FF_307  F       F         Vibrio sp.
m060    non-trans       FF_351  F       F         Enterovibrio norvegicus
m061    non-trans       FF_472  F       F         Enterovibrio norvegicus
m062    mobilization    FF_472  F       F         Enterovibrio norvegicus
m063    conjugation     FF_472  F       F         Enterovibrio norvegicus
m064    mobilization    FF_472  F       F         Enterovibrio norvegicus
m065    non-trans       FF_472  F       F         Enterovibrio norvegicus
m066    phage           FF_472  F       F         Enterovibrio norvegicus
m067    non-trans       FF_472  F       F         Enterovibrio norvegicus
m068    non-trans       FF_482  F       F         Vibrio sp.
m069    mobilization    FF_482  F       F         Vibrio sp.
m070    non-trans       FF_482  F       F         Vibrio sp.
m071    non-trans       FF_482  F       F         Vibrio sp.
m072    phage           FF_482  F       F         Vibrio sp.
m073    non-trans       ZS_101  S       Z         Vibrio splendidus
m074    conjugation     1F_145  F       1         Vibrio splendidus
m075    non-trans       1F_145  F       1         Vibrio splendidus
m076    non-trans       1F_187  F       1         Vibrio tasmaniensis
m077    non-trans       1F_253  F       1         Vibrio tasmaniensis
m078    non-trans       1F_292  F       1         Vibrio tasmaniensis
m079    mobilization    1F_292  F       1         Vibrio tasmaniensis
m080    non-trans       1F_292  F       1         Vibrio tasmaniensis
m081    mobilization    FF_3    F       F         Vibrio tasmaniensis
m082    non-trans       FF_3    F       F         Vibrio tasmaniensis
m083    mobilization    FF_3    F       F         Vibrio tasmaniensis
m084    non-trans       FF_3    F       F         Vibrio tasmaniensis
m085    non-trans       FF_3    F       F         Vibrio tasmaniensis
m086    non-trans       FF_7    F       F         Enterovibrio norvegicus
m087    non-trans       ZF_221  F       Z         Vibrio fischeri
m088    mobilization    ZF_221  F       Z         Vibrio fischeri
m089    mobilization    ZF_221  F       Z         Vibrio fischeri
m090    non-trans       5S_149  S       5         Vibrio kanaloae
m091    non-trans       FF_113  F       F         Enterovibrio calviensis-like
m092    phage           FF_113  F       F         Enterovibrio calviensis-like
m093    non-trans       FF_113  F       F         Enterovibrio calviensis-like
m094    non-trans       FF_1    F       F         Vibrio splendidus
m095    mobilization    FF_1    F       F         Vibrio splendidus
m096    conjugation     FF_291  F       F         Vibrio sp.
m097    mobilization    FF_304  F       F         Vibrio sp.
m098    non-trans       FF_304  F       F         Vibrio sp.
m099    mobilization    FF_32   F       F         Enterovibrio norvegicus
m100    non-trans       FF_59   F       F         Vibrio tasmaniensis
m101    non-trans       ZF_76   F       Z         Vibrio tasmaniensis
m102    mobilization    ZF_76   F       Z         Vibrio tasmaniensis
m103    non-trans       ZS_138  S       Z         Vibrio splendidus
m104    non-trans       1F_148  F       1         Vibrio sp.
m105    non-trans       1F_263  F       1         Vibrio tasmaniensis
m106    non-trans       1F_279  F       1         Vibrio tasmaniensis
m107    non-trans       1F_279  F       1         Vibrio tasmaniensis
m108    non-trans       1F_279  F       1         Vibrio tasmaniensis
m109    non-trans       1F_279  F       1         Vibrio tasmaniensis
m110    phage           1F_97   F       1         Vibrio sp.
m111    mobilization    5F_20   F       5         Vibrio tasmaniensis
m112    non-trans       5F_20   F       5         Vibrio tasmaniensis
m113    mobilization    5F_275  F       5         Vibrio tasmaniensis
m114    non-trans       5F_275  F       5         Vibrio tasmaniensis
m115    mobilization    5F_275  F       5         Vibrio tasmaniensis
m116    non-trans       5S_242  S       5         Vibrio splendidus
m117    conjugation     FF_191  F       F         Vibrio splendidus
m118    non-trans       FF_191  F       F         Vibrio splendidus
m119    non-trans       ZF_193  F       Z         Vibrio tasmaniensis
m120    mobilization    ZF_193  F       Z         Vibrio tasmaniensis
m121    non-trans       ZF_45   F       Z         Vibrio sp.
m122    mobilization    ZF_45   F       Z         Vibrio sp.
m123    mobilization    ZF_53   F       Z         Vibrio sp.
m124    non-trans       ZF_53   F       Z         Vibrio sp.
m125    mobilization    ZF_6    F       Z         Vibrio sp.
m126    mobilization    ZS_138  S       Z         Vibrio splendidus
m127    conjugation     ZS_190  S       Z         Vibrio splendidus cluster 1
m128    non-trans       ZS_190  S       Z         Vibrio splendidus cluster 1
m129    non-trans       1F_164  F       1         Enterovibrio norvegicus
m130    non-trans       1F_169  F       1         Vibrio tasmaniensis
m131    non-trans       1F_243  F       1         Vibrio crassostreae
m132    non-trans       1F_255  F       1         Vibrio crassostreae
m133    non-trans       1F_255  F       1         Vibrio crassostreae
m134    non-trans       1F_260  F       1         Enterovibrio norvegicus
m135    mobilization    1F_260  F       1         Enterovibrio norvegicus
m136    non-trans       1F_260  F       1         Enterovibrio norvegicus
m137    mobilization    1F_260  F       1         Enterovibrio norvegicus
m138    mobilization    1F_275  F       1         Vibrio splendidus
m139    mobilization    1F_283  F       1         Vibrio splendidus
m140    non-trans       1F_9    F       1         Vibrio tasmaniensis
m141    non-trans       FF_1    F       F         Vibrio splendidus
m142    mobilization    FF_1    F       F         Vibrio splendidus
m143    non-trans       FF_1    F       F         Vibrio splendidus
m144    non-trans       FF_1    F       F         Vibrio splendidus
m145    non-trans       FF_1    F       F         Vibrio splendidus
m146    non-trans       FF_1    F       F         Vibrio splendidus
m147    non-trans       FF_146  F       F         Vibrio sp.
m148    mobilization    FF_146  F       F         Vibrio sp.
m149    non-trans       FF_164  F       F         Enterovibrio calviensis-like
m150    non-trans       FF_164  F       F         Enterovibrio calviensis-like
m151    conjugation     FF_172  F       F         Vibrio splendidus
m152    non-trans       FF_172  F       F         Vibrio splendidus
m153    conjugation     FF_1    F       F         Vibrio splendidus
m154    non-trans       FF_1    F       F         Vibrio splendidus
m155    non-trans       FF_1    F       F         Vibrio splendidus
m156    mobilization    FF_24   F       F         Vibrio splendidus
m157    non-trans       FF_24   F       F         Vibrio splendidus
m158    non-trans       FF_267  F       F         Vibrio tasmaniensis
m159    non-trans       FF_267  F       F         Vibrio tasmaniensis
m160    non-trans       FF_31   F       F         Vibrio ordalii
m161    conjugation     FF_32   F       F         Enterovibrio norvegicus
m162    non-trans       FF_59   F       F         Vibrio tasmaniensis
m163    non-trans       FF_59   F       F         Vibrio tasmaniensis
m164    conjugation     FF_59   F       F         Vibrio tasmaniensis
m165    non-trans       ZF_53   F       Z         Vibrio sp.
m166    non-trans       1S_139  S       1         Vibrio crassostreae
m167    phage           FF_233  F       F         Vibrio tasmaniensis
m168    mobilization    FF_249  F       F         Vibrio tasmaniensis
m169    phage           FF_262  F       F         Enterovibrio norvegicus
m170    non-trans       FF_266  F       F         Vibrio tasmaniensis
m171    phage           FF_266  F       F         Vibrio tasmaniensis
m172    mobilization    FF_268  F       F         Vibrio tasmaniensis
m173    mobilization    FF_268  F       F         Vibrio tasmaniensis
m174    non-trans       FF_268  F       F         Vibrio tasmaniensis
m175    conjugation     FF_268  F       F         Vibrio tasmaniensis
m176    mobilization    FF_268  F       F         Vibrio tasmaniensis
m177    non-trans       FF_268  F       F         Vibrio tasmaniensis
m178    non-trans       FF_268  F       F         Vibrio tasmaniensis
m179    non-trans       FF_307  F       F         Vibrio sp.
m180    non-trans       FF_371  F       F         Vibrio sp.
m181    mobilization    FF_371  F       F         Vibrio sp.
m182    non-trans       FF_376  F       F         Vibrio tasmaniensis
m183    non-trans       FF_376  F       F         Vibrio tasmaniensis
m184    mobilization    FF_376  F       F         Vibrio tasmaniensis
m185    non-trans       FF_376  F       F         Vibrio tasmaniensis
m186    non-trans       FF_376  F       F         Vibrio tasmaniensis
m187    non-trans       FF_451  F       F         Vibrio ordalii
Table 4. Summary of strains carrying ECEs
Host  Number of ECEs  Phage  Conjugative ECEs  Mobilizable ECEs  Non-transmissible ECEs
1F_145
1F_146
1F_148
1F_164
1F_169
1F_187
1F_189
1F_243
1F_253
1F_255
1F_260
1F_263
1F_275
1F_279
1F_283
1F_292
1F_9
1F_97
1S_120
1S_139
1S_269
1S_77
5F_20
5F_275
5F_7
5S_118
5S_122
5S_149
5S_214
5S_22
9
4
1
1
1
1
1
1
1
2
4
1
1
5
1
3
1
1
1
1
2
1
2
3
1
1
1
2
5
1
2
2
2
1
1
1
5S_235
1
5S_239
5S_240
5S_242
5S_268
3
1
1
1
1
11
1
5S_5
FF_1
FF_110
2
4
1
1
1
1
1
1
1
2
2
1
1
5
1
1
2
1
1
1
1
2
1
1
1
1
1
1
1
3
1
1
2
1
1
1
1
1
2
1
1
1
1
1
1
2
8
FF_112
FF_113
FF_136
FF_145
FF_146
FF_152
FF_164
FF_172
FF_174
FF_191
FF_210
FF_233
FF_24
FF_249
FF_262
FF_266
FF_267
FF_268
FF_273
FF_286
FF_291
FF_3
FF_304
FF_307
FF_308
FF_31
FF_32
FF_351
FF_371
FF_375
FF_376
FF_451
FF_472
FF_482
FF_59
FF_61
FF_7
ZF_193
ZF_221
ZF_45
ZF_53
5
3
1
1
2
2
2
2
2
2
1
1
2
1
1
2
2
7
1
3
1
5
3
3
2
1
2
1
2
2
5
1
7
5
4
1
1
2
3
2
3
1
1
1
2
1
1
1
2
1
1
1
1
2
1
1
1
1
1
1
1
1
1
1
1
1
1
3
1
1
1
2
3
3
1
1
1
1
2
1
1
1
3
1
1
1
1
1
1
1
1
1
1
1
2
1
1
1
2
1
1
1
1
1
4
1
3
3
3
1
1
1
1
1
2
ZF_6
ZF_76
ZS_101
ZS_138
ZS_190
1
2
1
2
2
1
1
1
References
1. Polz MF, Alm EJ, Hanage WP. 2013. Horizontal gene transfer and the evolution
of bacterial and archaeal population structure. Trends in genetics: TIG 29:170-175.
2. Wiedenbeck J, Cohan FM. 2011. Origins of bacterial diversity through horizontal
genetic transfer and adaptation to new ecological niches. FEMS microbiology
reviews 35:957-976.
3. Garcillan-Barcia MP, Alvarado A, de la Cruz F. 2011. Identification of bacterial
plasmids based on mobility and plasmid population biology. FEMS microbiology
reviews 35:936-956.
4. Francia MV, Varsaki A, Garcillan-Barcia MP, Latorre A, Drainas C, de la Cruz F.
2004. A classification scheme for mobilization regions of bacterial plasmids.
FEMS microbiology reviews 28:79-100.
5. Phillips G, Funnell BE. 2004. Plasmid biology. ASM Press, Washington, D.C.
6. Kan S, Fornelos N, Schuch R, Fischetti VA. 2013. Identification of a ligand on the
Wip1 bacteriophage highly specific for a receptor on Bacillus anthracis. J
Bacteriol 195:4355-4364.
7. Ravin NV. 2011. N15: the linear phage-plasmid. Plasmid 65:102-109.
8. Hunt DE, David LA, Gevers D, Preheim SP, Alm EJ, Polz MF. 2008. Resource
partitioning and sympatric differentiation among closely related
bacterioplankton. Science 320:1081-1085.
9. Hunt DE, David LA, Gevers D, Preheim SP, Alm EJ, Polz MF. 2008. Resource
partitioning and sympatric differentiation among closely related
bacterioplankton. Science 320:1081-1085.
10. Preheim SP, Boucher Y, Wildschutte H, David LA, Veneziano D, Alm EJ, Polz MF.
2011. Metapopulation structure of Vibrionaceae among coastal marine
invertebrates. Environ. Microbiol. 13:265-275.
11. Preheim SP, Timberlake S, Polz MF. 2011. Merging taxonomy with ecological
population prediction: a case study of Vibrionaceae. Appl Environ Microbiol
77:7195-7206.
12. Shapiro BJ, Polz MF. 2014. Ordering microbial diversity into ecologically and
genetically cohesive units. Trends Microbiol.
13. Smillie C, Garcillan-Barcia MP, Francia MV, Rocha EP, de la Cruz F. 2010.
Mobility of plasmids. Microbiology and molecular biology reviews: MMBR
74:434-452.
14. Xue H, Xu Y, Boucher Y, Polz MF. 2012. High frequency of a novel filamentous
phage, VCY phi, within an environmental Vibrio cholerae population. Appl
Environ Microbiol 78:28-33.
15. Rodrigue S, Materna AC, Timberlake SC, Blackburn MC, Malmstrom RR, Alm EJ,
Chisholm SW. 2010. Unlocking short read sequencing for metagenomics. PLoS
One 5:e11840.
16. Chevreux B, Pfisterer T, Drescher B, Driesel AJ, Muller WE, Wetter T, Suhai S.
2004. Using the miraEST assembler for reliable and automated mRNA transcript
assembly and SNP detection in sequenced ESTs. Genome Res 14:1147-1159.
17. Martin M. 2011. Cutadapt removes adapter sequences from high-throughput
sequencing reads. EMBnet.journal 17:10-12.
18. Zerbino DR. 2010. Using the Velvet de novo assembler for short-read sequencing
technologies. Current protocols in bioinformatics / editorial board, Andreas D.
Baxevanis ... [et al.] Chapter 11:Unit 1115.
19. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, Formsma K, Gerdes
S, Glass EM, Kubal M, Meyer F, Olsen GJ, Olson R, Osterman AL, Overbeek RA,
McNeil LK, Paarmann D, Paczian T, Parrello B, Pusch GD, Reich C, Stevens R,
Vassieva O, Vonstein V, Wilke A, Zagnitko O. 2008. The RAST Server: rapid
annotations using subsystems technology. BMC Genomics 9:75.
20. Chen F, Mackey AJ, Stoeckert CJ, Jr., Roos DS. 2006. OrthoMCL-DB: querying a
comprehensive multi-species collection of ortholog groups. Nucleic Acids Res
34:D363-368.
21. Dongen Sv. 2000. Graph Clustering by Flow Simulation. University of Utrecht.
22. Ning Z, Cox AJ, Mullikin JC. 2001. SSAHA: a fast search method for large DNA
databases. Genome Res 11:1725-1729.
23. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G,
Durbin R, Genome Project Data Processing S. 2009. The Sequence
Alignment/Map format and SAMtools. Bioinformatics 25:2078-2079.
24. Guglielmini J, Quintais L, Garcillan-Barcia MP, de la Cruz F, Rocha EP. 2011. The
repertoire of ICE in prokaryotes underscores the unity, diversity, and ubiquity of
conjugation. PLoS Genet 7:e1002222.
25. Guglielmini J, de la Cruz F, Rocha EP. 2013. Evolution of conjugation and type IV
secretion systems. Molecular biology and evolution 30:315-331.
26. Finn RD, Clements J, Eddy SR. 2011. HMMER web server: interactive sequence
similarity searching. Nucleic Acids Res 39:W29-37.
27. Eddy SR. 2011. Accelerated Profile HMM Searches. PLoS Comput Biol
7:e1002195.
28. Leplae R, Lima-Mendez G, Toussaint A. 2010. ACLAME: a CLAssification of
Mobile genetic Elements, update 2010. Nucleic Acids Res 38:D57-61.
29. Delcher AL, Bratke KA, Powers EC, Salzberg SL. 2007. Identifying bacterial genes
and endosymbiont DNA with Glimmer. Bioinformatics 23:673-679.
30. Fischer S, Brunk BP, Chen F, Gao X, Harb OS, Iodice JB, Shanmugam D, Roos DS,
Stoeckert CJ, Jr. 2011. Using OrthoMCL to assign proteins to OrthoMCL-DB
groups or to cluster proteomes into new ortholog groups. Current protocols in
bioinformatics / editorial board, Andreas D. Baxevanis ... [et al.] Chapter 6:Unit 6
12 11-19.
31. Li L, Stoeckert CJ, Jr., Roos DS. 2003. OrthoMCL: identification of ortholog
groups for eukaryotic genomes. Genome Res 13:2178-2189.
32. Spinelli L, Gambette P, Chapple CE, Robisson B, Baudot A, Garreta H, Tichit L,
Guenoche A, Brun C. 2013. Clust&See: a Cytoscape plugin for the identification,
visualization and manipulation of network clusters. Bio Systems 113:91-95.
33. Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker T. 2011. Cytoscape 2.8: new
features for data integration and network visualization. Bioinformatics 27:431-432.
34. Kohl M, Wiese S, Warscheid B. 2011. Cytoscape: software for visualization and
analysis of biological networks. Methods in molecular biology 696:291-303.
35. Leplae R, Hebrant A, Wodak SJ, Toussaint A. 2004. ACLAME: a CLAssification of
Mobile genetic Elements. Nucleic Acids Res 32:D45-49.
36. Petersen J. 2011. Phylogeny and compatibility: plasmid classification in the
genomics era. Archives of microbiology 193:313-321.
37. Schumacher MA. 2012. Bacterial plasmid partition machinery: a minimalist
approach to survival. Curr Opin Struct Biol 22:72-79.
38. Schuster CF, Bertram R. 2013. Toxin-antitoxin systems are ubiquitous and
versatile modulators of prokaryotic cell fate. FEMS microbiology letters 340:73-85.
39. Vasu K, Nagaraja V. 2013. Diverse functions of restriction-modification systems
in addition to cellular defense. Microbiology and molecular biology reviews:
MMBR 77:53-72.
40. Silverman JM, Brunet YR, Cascales E, Mougous JD. 2012. Structure and
regulation of the type VI secretion system. Annual review of microbiology
66:453-472.
41. Records AR. 2011. The type VI secretion system: a multipurpose delivery system
with a phage-like machinery. Molecular plant-microbe interactions: MPMI
24:751-757.
42. Blattner FR, Plunkett G, 3rd, Bloch CA, Perna NT, Burland V, Riley M,
Collado-Vides J, Glasner JD, Rode CK, Mayhew GF, Gregor J, Davis NW, Kirkpatrick HA,
Goeden MA, Rose DJ, Mau B, Shao Y. 1997. The complete genome sequence of
Escherichia coli K-12. Science 277:1453-1462.
43. Battermann A, Disse-Kromker C, Dreiseikelmann B. 2003. A functional
plasmid-borne rrn operon in soil isolates belonging to the genus Paracoccus.
Microbiology 149:3587-3593.
44. Kunnimalaiyaan M, Stevenson DM, Zhou Y, Vary PS. 2001. Analysis of the
replicon region and identification of an rRNA operon on pBM400 of Bacillus
megaterium QM B1551. Mol Microbiol 39:1010-1021.
45. Tiainen T, Pedersen K, Larsen JL. 1995. Ribotyping and plasmid profiling of Vibrio
anguillarum serovar O2 and Vibrio ordalii. The Journal of applied bacteriology
79:384-392.
46. Dunn AK, Martin MO, Stabb EV. 2005. Characterization of pES213, a small
mobilizable plasmid from Vibrio fischeri. Plasmid 54:114-134.
47. Reva O, Bezuidt O. 2012. Distribution of horizontally transferred heavy metal
resistance operons in recent outbreak bacteria. Mobile genetic elements 2:96-100.
48. van Hal SJ, Wiklendt A, Espedido B, Ginn A, Iredell JR. 2009. Immediate
appearance of plasmid-mediated resistance to multiple antibiotics upon
antibiotic selection: an argument for systematic resistance epidemiology. J Clin
Microbiol 47:2325-2327.
49. Bradley DE, Taylor DE, Cohen DR. 1980. Specification of surface mating systems
among conjugative drug resistance plasmids in Escherichia coli K-12. J Bacteriol
143:1466-1470.
50. Naka H, Liu M, Actis LA, Crosa JH. 2013. Plasmid- and chromosome-encoded
siderophore anguibactin systems found in marine vibrios: biosynthesis, transport
and evolution. Biometals: an international journal on the role of metal ions in
biology, biochemistry, and medicine 26:537-547.
51. Naka H, Lopez CS, Crosa JH. 2008. Reactivation of the vanchrobactin siderophore
system of Vibrio anguillarum by removal of a chromosomal insertion sequence
originated in plasmid pJM1 encoding the anguibactin siderophore system.
Environ Microbiol 10:265-277.
CHAPTER FOUR
Conclusions and Future Directions
4. Chapter Four: Conclusions and Future Directions
The overall goal of this study was to explore the diversity and dynamics of
extrachromosomal elements (ECEs) within the context of ecologically cohesive host
populations. By using two model systems, a Vibrio cholerae population from a brackish
water pond and several Vibrio populations co-existing in the marine coastal environment,
we have found that ECEs are prevalent and diverse in Vibrio populations in the wild and
include different types of plasmids and temperate phages. The distribution of ECEs is
correlated with the environmental distribution of their hosts, indicating that host ecology
may influence ECE biology. In addition, we have shown for the first time that non-transmissible
ECEs are the most common among Vibrio ECEs and may have been transferred recently and
frequently, despite the fact that they lack genetic components typically found in
mobilizable or conjugative ECEs. Finally, the gene pool of the ECEs was found to be
highly dynamic, with high levels of recent recombination among different types of ECEs.
4.1. Study on Vibrio cholerae model
A large collection of Vibrio cholerae isolates from a brackish water pond in Massachusetts
were found to contain a novel filamentous phage VCY at very high prevalence with -40%
of cells being infected. These phages are part of the Inoviridaethat replicate by extrusion,
a process that does not kill the host cells. The phage occurred in both the host genome
integrative form (IF) and the plasmid-like replicative form (RF). The high prevalence of
the phage strongly suggest a potential impact on the host population. This view is further
strengthened by our finding that the frequency of the two forms of VCY+ differ in strains
collected from lagoon and the pond of the brackish water system with the replicative form
being much more prevalent in the latter. These findings suggest that filamentous phage can
be an important component of the environmental biology of V. cholerae and are not limited
to pathogenic strains.
However, several questions remain for further investigation. One question is whether the
uneven distribution of the two forms of VCYφ between the lagoon and the pond samples
is due to ecological differences between the two locations. In particular, differences in
salt concentration, pH and temperature may affect the physiology of Vibrio
cholerae. Hence the influence of these factors should be investigated to determine whether
they lead to differences in the distribution and induction of VCYφ. Integration of VCYφ
into the host chromosomes may involve enzymes and potential cofactors that catalyze and
facilitate the process. We suggest an approach in which genome-wide gene expression
patterns are analyzed in conjunction with phage dynamics. Other factors may also be
involved, such as intracellular localization, topological structure, and the binding
affinity between the phage DNA and the relevant proteins and regulators.
4.2. Study on ecologically and genetically cohesive Vibrio model
Our study is the first to perform large-scale screening of ECEs in marine bacteria using
Vibrio as a model. The identification of 187 ECEs in the 660 strains analyzed indicates that
ECEs are prevalent in Vibrio. These ECEs occur in diverse types, including mobilizable,
non-transmissible and conjugative plasmids, and bacteriophages. They are highly
dynamic, predominantly showing high ecological and evolutionary turnover. We also found
that ECEs are enriched in isolates obtained from the free-living rather than the
particle-associated fractions, suggesting a correlation between ECE presence and host
lifestyle. Our data also show that non-transmissible ECEs appear to be the most common type
in Vibrio and may have been transferred recently and frequently, thus indicating that the name for
these types of ECEs is a misnomer. However, potential mechanisms of transfer remain to
be uncovered. Finally, by assessing the genetic components of ECEs, we have found that
ECEs are evolutionarily highly dynamic and subject to frequent gene exchange and
rearrangement.
The work presented here raises several questions for future investigation. As in the V.
cholerae example, environmental factors may select for different ECE behavior. However,
our understanding of the relevant factors that differentiate particle-associated and
free-living host populations is still incomplete. Nonetheless, environmental conditions may
be a factor, with the free-living fraction being more variable in both the composition and
the concentration of nutrients. It might therefore be speculated that host populations in a
relatively unstable ecological niche benefit from a dynamic pool of ECEs that offers a
source of additional genes, whereas ECEs may be less needed or beneficial in a relatively
more stable niche such as particles.
A further important question is why the so-called non-transmissible ECEs appear to be
subject to frequent transfer, at least on par with plasmids that carry genes encoding
transfer mechanisms, even though they do not encode the elements and machinery
typically employed by mobilizable or conjugative plasmids or phages.
One could therefore argue that they may (1) represent new bacteriophages possessing a
novel infection mechanism, (2) be transferred through a mechanism yet to be discovered,
or (3) have been transferred through transformation. To determine whether they are a group
of new temperate phages, induction experiments could be performed; if phage particles
were obtained, these could be collected and their sequences compared to the collection of
ECEs. Interestingly, evidence from other studies has shown that ECEs can be
packaged into small vesicles that are secreted from donor cells and taken up by recipient
cells. For our collection, electron microscopy could first be used to identify vesicles that
potentially carry ECEs, followed by genomic sequence analysis of the collected particles
to determine the types of ECEs they might contain.
Taken together, our results show that ECEs are prevalent in natural Vibrio populations and
that environmental factors may contribute to their diversity and distribution in distinct
ecological niches. Although many questions remain to be addressed in further investigation,
these results advance our understanding of horizontal gene transfer and its role
in microbial evolution.