Distribution of Chromosome Length Variation in Natural Isolates of
Escherichia coli
Ulfar Bergthorsson and Howard Ochman
Department of Biology, University of Rochester
Large-scale variation in chromosome size was analyzed in 35 natural isolates of Escherichia coli by physical
mapping with a restriction enzyme whose sites are restricted to rDNA operons. Although the genetic maps and
chromosome lengths of the laboratory strains E. coli K12 and Salmonella enterica sv. Typhimurium LT2 are highly
congruent, chromosome lengths among natural strains of E. coli can differ by as much as 1 Mb, ranging from 4.5
to 5.5 Mb in length. This variation has been generated by multiple changes dispersed throughout the genome, and
these alterations are correlated; i.e., additions to one portion of the chromosome are often accompanied by additions
to other chromosomal regions. This pattern of variation is most probably the result of selection acting to maintain
equal distances between the replication origin and terminus on each side of the circular chromosome. There is a
large phylogenetic component to the observed size variation: natural isolates from certain subgroups of E. coli have
consistently larger chromosomes, suggesting that much of the additional DNA in larger chromosomes is shared
through common ancestry. There is no significant correlation between genome sizes and growth rates, which counters the view that the streamlining of bacterial genomes is a response to selection for faster growth rates in natural
populations.
Introduction
Escherichia coli and Salmonella enterica are closely related species of enteric bacteria that diverged an
estimated 120 to 160 MYA (Ochman and Wilson 1987).
Comparisons of the genetic maps of the best characterized representatives of these species, E. coli K12 and S.
enterica sv. Typhimurium LT2, reveal extensive conservation in the order and spacing of mapped loci (Riley
and Krawiec 1987). Moreover, the chromosome sizes of
these two strains, as estimated by physical mapping procedures, are very similar: the E. coli K12 chromosome
is 4.6 Mb in length (Kohara, Akiyama, and Isono 1987;
Smith et al. 1987), whereas that of Typhimurium LT2 is
4.8 Mb (Liu and Sanderson 1992). Although these findings suggest that the structure of bacterial genomes is
evolutionarily conserved, the range of variation in genome size among natural isolates of E. coli greatly exceeds that observed between E. coli K12 and Typhimurium LT2 (Brenner et al. 1972; Bergthorsson and
Ochman 1995). Genome sizes of natural isolates of E.
coli can vary by as much as 650 kb (Bergthorsson and
Ochman 1995), and among serovars of S. enterica (Enteritidis, Paratyphi, Typhi, and Typhimurium), chromosome sizes can differ by 300 kb (Liu, Hessel, and Sanderson 1993).
Despite the overall correspondence in the size and
organization of the E. coli K12 and Typhimurium LT2
chromosomes, alignments of their physical and genetic
maps have revealed several large regions confined to
only one species. These regions—termed ‘‘chromosomal
loops’’ (Riley and Krawiec 1987)—are distributed
throughout the chromosome and cumulatively account
for more than 10% of the DNA in each species. Analysis
of the base composition of sequenced genes from E. coli
suggests that as much as 15%, or 700 kb, of the E. coli
K12 genome may have been acquired by transfer from
foreign sources and that perhaps 30 kb of foreign DNA
is acquired every million years (Lawrence and Ochman
1997).
Abbreviations: PFGE, pulsed-field gel electrophoresis; LEE, locus
of enterocyte effacement; PCR, polymerase chain reaction; sv., serovar.
Key words: chromosome size variation, Escherichia coli, genome
evolution, physical mapping, pulsed field gel electrophoresis.
Address for correspondence and reprints: Howard Ochman, Department of Biology, University of Rochester, Rochester, New York
14627. E-mail: ochman@ho.biology.rochester.edu.
Mol. Biol. Evol. 15(1):6–16. 1998
© 1998 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038
Although the evolution of E. coli and S. enterica
has been marked by the acquisition and deletion of large
regions of DNA, the chromosomes of these species remain symmetric in the sense that approximately equal
distances are maintained between the replication origin
and terminus on each side of the circular chromosome.
Selection to maintain chromosome symmetry is thought
to preserve the order of genes by reducing rearrangements, and there is some support for this notion from
experimental populations of E. coli in which the deleterious effect of an inversion is related to the resulting
asymmetry in the distance between the origin and terminus (François et al. 1990; Hill, Harvey, and Gray
1990; Riley and Sanderson 1990). Furthermore, in two
naturally occurring isolates of S. enterica, inversions
have apparently compensated for asymmetries introduced by large chromosomal insertions (Liu and Sanderson 1995a, 1995b). Thus, it appears that patterns of
change in bacterial genomes are affected by natural selection to maintain chromosome symmetry.
In this paper, we address several issues concerning
the large degree of variation in genome size observed
among naturally occurring strains of E. coli:
1. How much of the variation in total genome size
can be attributed to chromosomal DNA? The previously published estimates of genome size in natural
isolates of E. coli were based on the cumulative
lengths of all restriction fragments resolved on
pulsed-field gels, and in that study, the assignment of
length variation to particular chromosomal regions,
or even to extrachromosomal elements, was impeded
by the highly variable restriction fragment patterns
present in natural populations.
2. Is size variation distributed randomly over the chromosome, or are specific regions prone to acquire and
delete DNA? Laboratory isolates of E. coli exhibit
more structural variation close to the replication terminus (Perkins et al. 1993), perhaps due to higher
recombination rates in this region. Based on these
studies, we might also expect the termination region
to be more variable in natural isolates; however, in
S. enterica, a region near the replication origin accounts for most of the variation in chromosome
length (Liu, Hessel, and Sanderson 1993).
3. What does the pattern of chromosome variation in
nature reveal about selective constraints acting on
chromosome organization? For example, are differences in chromosome length among strains symmetrically distributed around the replication origin as expected if natural selection acts to keep equal distances between the origin and terminus of replication?
4. What is the absolute rate of chromosome evolution
in enteric bacteria? The original comparisons of E.
coli and Typhimurium suggest that chromosome size
is stable over long evolutionary periods.
Applying the homing endonuclease I-CeuI (Marshall and Lemieux 1992), whose restriction sites occur
only in rDNA operons, we have investigated the distribution of length variation over the entire chromosome
in natural isolates of E. coli. In this study, we establish
that the range of variation in chromosome length is in
fact larger than that of overall genome size as previously detected among natural isolates of E. coli, and that
these size differences are distributed symmetrically with
respect to the replication origin and terminus.
Materials and Methods
Bacterial Strains
We selected 35 strains of E. coli from the ECOR
reference collection (Ochman and Selander 1984),
which includes natural isolates from a wide variety of
hosts and geographic regions. Phylogenetic relationships
among these strains have been inferred from variation
at 38 polymorphic loci as detected by enzyme electrophoresis (Herzer et al. 1990).
Restriction Endonuclease Digestion
Agarose plugs containing intact genomic DNA
were prepared as previously described (Bergthorsson
and Ochman 1995). Approximately 50 µl of each agarose plug (containing 10 ng/µl DNA) was digested overnight at 37°C with 0.3 U of I-CeuI (NEB) in 50 µl of
restriction enzyme buffer. Partial digests were generated
by overnight digestion with 0.05 units of enzyme.
PFGE and Physical Mapping
Approximately 15 µl of an agarose plug was inserted into a 0.9% agarose gel and subjected to electrophoresis in 0.5× TBE at 14°C in a CHEF-DR II pulsed-field gel box (Bio-Rad Laboratories, Richmond, Calif.).
To separate fragments smaller than 1,000 kb, electrophoresis proceeded for 24 h at 180 V with pulse times
varying according to the intended range of resolution
(Bergthorsson and Ochman 1995). To resolve the largest
I-CeuI restriction fragment, which is typically over
2,500 kb in length, samples were electrophoresed in a
0.7% agarose gel for 120 h at 60 V with pulse times
ramped from 10 to 16 min over the course of the run.
Gels were stained in 0.01% ethidium bromide and photographed under UV light. Lambda ladder, low range
PFG marker (NEB), and chromosomes from Saccharomyces cerevisiae and Hansenula wingei (Bio-Rad) were
used as molecular size markers.
Size estimates of H. wingei chromosomes, as provided by the supplier, were used with one exception. Our
comparisons of H. wingei chromosomes with I-CeuI
fragment A of E. coli K12 (which is 2,450 kb in length)
yielded a size for H. wingei chromosome VI of 2.5 Mb
instead of 2.7 Mb, as reported by the supplier.
In cases where we did not observe consistent differences in the sizes of particular I-CeuI fragments across
multiple runs, fragment size estimates were considered to
be equal and pooled in the calculation of average size and
its standard error. (This pooling was not applied in the
estimates of fragment A where sizes and standard errors
were based on one to four runs for each isolate.)
The relative chromosomal position of each I-CeuI
fragment was established by partial digests, and the
identity of fragments was determined from Southern
blots of I-CeuI-digested DNA, which were probed with
genes whose locations are known on the E. coli K12
chromosome. Loci used in identification of I-CeuI fragments by Southern hybridization were: dif at 34.3′ for
fragment A; rpoS at 61.8′ for fragment B; oriC at 84.6′
for fragment C; uvrD at 86.2′ for fragment D; glnLG at
87.3′ for fragment E; and ileS at 0.5′ for fragment G.
When referring to a particular I-CeuI fragment, we are
denoting the pulsed-field gel band that hybridized to the
corresponding probe. Map positions of these probes are
based on the nucleotide sequence of the E. coli K12
chromosome (Blattner et al. 1997, GenBank accession
number U00094).
To confirm fragment identity, PCR assays were performed on DNA recovered from I-CeuI fragments originally recognized as B, C, and G. Primers for genes at
the opposite ends of fragments B, C, and G on the E.
coli K12 chromosome were developed based on the nucleotide sequence of the E. coli K12 chromosome (Blattner et al. 1997). The assayed genes were as follows:
clpB at 58.9′ and yhdZ at 72.7′ for I-CeuI fragment B;
aroE at 73.9′ for I-CeuI fragment C; and metA at
90.8′ and proS at 4.7′ for fragment G. Small samples
of agarose containing the appropriate I-CeuI restriction
fragment were removed from a gel with a Pasteur pipette
and melted in 30 µl ddH2O, and 1 µl of this preparation
was used as template in the PCR. Amplification reactions proceeded for 25 cycles of 94°C for 1 min, 58°C
for 1 min, and 72°C for 1 min.
Plasmid Analysis
Plasmid DNA was isolated by the method of Kado
and Liu (1981), digested with a rare-cutting restriction
enzyme (Bln I, Not I, Sfi I, or Xba I), and resolved by
PFGE as described above.
Statistical Analysis
Differences in average chromosome size among
subspecific groups of E. coli were tested by a single
classification analysis of variance (Sokal and Rohlf
1981, p. 210).
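The single classification analysis of variance can be illustrated with a minimal Python sketch. The function name `one_way_anova_f` is our own, and the example uses the chromosome sizes listed in table 1 for only three of the five subgroups, so the resulting F ratio is not expected to reproduce the value reported for the full data set. The sketch returns the F ratio and its degrees of freedom; it does not compute a P value.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single classification ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Sum of squares among group means, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Sum of squares of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Chromosome sizes (kb) from table 1 for subgroups D, E, and B1 only.
group_d = [5110, 5320, 5150, 5350]
group_e = [5070, 5520, 5070]
group_b1 = [4710, 4790, 4830, 4760, 4870, 5200, 4890, 4890]

f_ratio, df_b, df_w = one_way_anova_f([group_d, group_e, group_b1])
print(f"F = {f_ratio:.2f} with {df_b} and {df_w} degrees of freedom")
```

The F ratio would then be compared with the critical value of the F distribution for the stated degrees of freedom.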
Chromosome symmetry and distribution of length
variation with respect to the replication origin and terminus were examined by comparing the cumulative
lengths of fragments flanking each side of the replication
origin. To eliminate nonindependence of data points due
to the common ancestry of isolates, we applied the
method of phylogenetic contrasts proposed by Felsenstein (1985). The analysis was performed with programs
written by Garland et al. (1993) and Martins (1996) and
applied to data extracted from the neighbor-joining tree
of the genetic relationships among strains (Herzer et al.
1990). Because the absolute values of standardized contrasts are inversely related to lengths of branches on the
tree, the branch lengths were log-transformed prior to
standardization and subsequent calculations (Garland,
Harvey, and Ives 1992). To test whether the correlations
were sensitive to uncertainties in the phylogenetic relationships among strains, phylogenetic contrasts were
performed on 1,000 random trees (Martins 1996) and on
an alternative tree for the ECOR collection based on
random amplified DNA sequences (Desjardins et al.
1995).
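As a rough illustration of the independent-contrasts calculation (Felsenstein 1985), the sketch below computes standardized contrasts for the lengths of the two regions flanking the origin and correlates them through the origin. It is not the software of Garland et al. (1993) or Martins (1996). The region lengths are sums of fragment sizes from table 1 for four strains, whereas the tree topology, the unit branch lengths, and all function names are assumptions made only for this example; the log transformation of branch lengths described above is omitted.

```python
import math

def independent_contrasts(node, trait):
    """Return (node value, extra branch length, list of standardized contrasts).

    A tip is a strain name; an internal node is a tuple
    (left, left_branch_length, right, right_branch_length).
    """
    if isinstance(node, str):
        return trait[node], 0.0, []
    left, v_left, right, v_right = node
    x_left, extra_left, c_left = independent_contrasts(left, trait)
    x_right, extra_right, c_right = independent_contrasts(right, trait)
    # Branches leading to internal nodes are lengthened to account for
    # the uncertainty in the estimated ancestral value.
    v_left += extra_left
    v_right += extra_right
    contrast = (x_left - x_right) / math.sqrt(v_left + v_right)
    # Ancestral value: daughters weighted by the inverse of branch length.
    value = (x_left / v_left + x_right / v_right) / (1 / v_left + 1 / v_right)
    extra = v_left * v_right / (v_left + v_right)
    return value, extra, c_left + c_right + [contrast]

def correlation_through_origin(a, b):
    """Correlation of two sets of contrasts, forced through the origin."""
    numerator = sum(x * y for x, y in zip(a, b))
    return numerator / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))

# Hypothetical topology and branch lengths (NOT the tree of Herzer et al. 1990).
tree = (("ECOR 4", 1.0, "ECOR 13", 1.0), 1.0,
        ("ECOR 37", 1.0, "ECOR 42", 1.0), 1.0)

# Region lengths (kb) summed from table 1: B + C and D + E + F + G.
left_region = {"ECOR 4": 1234, "ECOR 13": 1195, "ECOR 37": 1368, "ECOR 42": 1359}
right_region = {"ECOR 4": 902, "ECOR 13": 895, "ECOR 37": 1050, "ECOR 42": 974}

_, _, left_contrasts = independent_contrasts(tree, left_region)
_, _, right_contrasts = independent_contrasts(tree, right_region)
r = correlation_through_origin(left_contrasts, right_contrasts)
print(f"{len(left_contrasts)} contrasts, r = {r:.2f}")
```

Because the sign of each contrast is arbitrary but is assigned identically for both traits, the correlation is computed through the origin rather than around the means.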
To test whether regions of the chromosome differ
in degree of variability, we compared the variances in
the lengths of different chromosomal regions based on
I-CeuI fragments. The variance in the length of a given
region is expected to increase linearly with fragment
size, assuming that the number of insertions, duplications and deletions per fragment increases with fragment
length. Therefore, the variances in the sizes of the I-CeuI fragments were standardized by dividing the variance of a given fragment by its average size before subjecting the values to pairwise F-tests to test the equality
of variances (Sokal and Rohlf 1981, p. 185). Critical
significance values of pairwise F-tests were adjusted
with a sequentially rejective Bonferroni procedure for
multiple comparisons (Holm 1979; Rice 1989).
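The standardization of variances and the sequentially rejective Bonferroni procedure can be sketched in Python as follows. The fragment lengths are those listed in table 1 for fragments C and G of the subgroup A strains; the function names and the list of P values passed to `holm` are hypothetical and serve only to show the mechanics. The sketch computes the ratio of standardized variances but leaves the lookup of the corresponding P value to a statistical table or library.

```python
from statistics import mean, variance

def standardized_variance(lengths):
    """Sample variance of fragment length divided by mean fragment length."""
    return variance(lengths) / mean(lengths)

def holm(p_values, alpha=0.05):
    """Sequentially rejective Bonferroni test (Holm 1979).

    Returns a list of booleans, True where the null hypothesis is rejected.
    """
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    rejected = [False] * len(p_values)
    for rank, i in enumerate(order):
        # The k-th smallest P value is compared with alpha / (m - k + 1).
        if p_values[i] > alpha / (len(p_values) - rank):
            break  # all remaining (larger) P values are also not rejected
        rejected[i] = True
    return rejected

# Fragment lengths (kb) for the 11 subgroup A strains in table 1.
fragment_c = [527, 515, 556, 515, 608, 575, 515, 527, 480, 480, 532]
fragment_g = [608, 699, 735, 639, 707, 639, 608, 639, 608, 608, 680]

f_ratio = standardized_variance(fragment_g) / standardized_variance(fragment_c)
print(f"ratio of standardized variances (G / C) = {f_ratio:.2f}")

# Hypothetical P values from four pairwise F-tests.
print(holm([0.001, 0.04, 0.03, 0.5]))
```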
Copy numbers of IS elements in ECOR strains
were taken from Sawyer et al. (1987), Hall et al. (1989)
and Lawrence, Ochman and Hartl (1992), and information on growth rates was from Mikkola and Kurland
(1991).
FIG. 1.—Locations of I-CeuI recognition sites on the E. coli K12
chromosome. I-CeuI cleaves at the seven rrn genes, whose map positions are indicated. The resulting restriction fragments are designated
A through G.
PCR
The chromosomes of certain strains of E. coli have
integrated large regions called pathogenicity islands.
The frequency of a 35-kb pathogenicity island—the locus of enterocyte effacement (LEE)—was assessed by
the PCR using primer sequences published in McDaniel
et al. (1995). Primers K255 and K260 flank the right
junction of the LEE (K260 is outside the locus), and
K295 and K296 flank the left junction (K295 is outside).
PCR products from reactions using the two primer pairs,
K255 and K260, and K295 and K296, denote the presence of the LEE at this site, whereas a reaction product
of 527 bp from the flanking primers (K260 and K261)
indicates that this site is not interrupted by the LEE or
any other piece of DNA. Amplification reactions were
performed for 25 cycles at 94°C for 1 min, 50°C for 1
min, and 72°C for 1 min.
Results
Variation in Chromosome Size Among Natural
Isolates of E. coli
All 35 ECOR isolates produced seven fragments
after digestion with I-CeuI, indicating that the number
of rrn operons is conserved among strains of E. coli.
Following the convention of Liu, Hessel, and Sanderson
(1993), these fragments were designated A through G
(fig. 1). Based on the cumulative sizes of these fragments for each strain, natural isolates of E. coli can differ by over 1 Mb in the lengths of their chromosomes,
with sizes ranging from 4,500 to 5,520 kb (table 1 and
fig. 2).
Differences Between Subgroups of E. coli
The phylogenetic tree of Herzer et al. (1990) displays five major subspecific groups within E. coli. We
detected significant differences in chromosome size between these subgroups of E. coli (F = 6.8, P < 0.001).
The laboratory strain E. coli K12 has a chromosome size
of only 4.6 Mb and is most closely related to strains
from subgroup A, which contains strains with the smallest chromosomes. Strains with the largest chromosomes
are found in subgroups B2, D, and E of the ECOR collection.
Changes in Different Chromosomal Regions are
Correlated
The most striking feature of the size variation in I-CeuI restriction fragments is that the sizes of fragments
to the left of the replication origin and those to the right
are strongly related (fig. 2). The correlation coefficient
between the cumulative size of the BC region, which
proceeds counterclockwise from the replication origin to
the terminus, and that of the DEFG region, which proceeds clockwise, is highly significant (r = 0.83, P <
0.001), denoting symmetry around the origin of replication (fig. 3). The corresponding correlation between
the region surrounding the origin of replication, fragments B through G, and the region flanking the terminus
represented by fragment A is much lower but still statistically significant (r = 0.52, P < 0.01).
Because the strains used in this analysis are phylogenetically related, the high degree of significance in
the correlation between the lengths on either side of the
origin may be inflated due to changes in a few ancestral
strains. Using phylogenetically independent contrasts,
the association between changes on both sides of the
replication origin remains significant (fig. 2b, r = 0.58,
P < 0.01) (fig. 4), and the 99% confidence limits on the
distribution of correlation coefficients generated by
1,000 random phylogenetic trees do not include the zero.
The correlation between the lengths of the region that
spans the replication origin and that spanning the terminus is also significant using the phylogenetic correction (r = 0.39, P < 0.05). These correlations would not
be expected if chromosomal insertions and deletions
were independent events.
Table 1
Sizes of I-CeuI Restriction Fragments and Total Chromosome Size (kb) of 35 Natural Isolates of Escherichia coli
                        I-CeuI FRAGMENTS (b)
SUBGROUP (a)  STRAIN    A (d)          B           C          D          E          F         G          SIZE (kb) (c)
Group A       ECOR 4    2,585 (35.0)   707 (3.4)   527 (2.0)  90 (0.6)   166 (1.2)  38 (0.3)  608 (3.3)  4,720
              ECOR 5    2,940 (10.0)   743 (3.9)   515 (2.5)  90 (0.6)   128 (1.2)  38 (0.3)  699 (3.3)  5,150
              ECOR 11   2,750 *        824 (5.7)   556 (4.0)  90 (0.6)   128 (1.2)  38 (0.3)  735 (4.5)  5,120
              ECOR 13   2,485 (12.0)   680 (3.4)   515 (2.5)  90 (0.6)   128 (1.2)  38 (0.3)  639 (2.1)  4,580
              ECOR 14   2,645 (45.0)   735 (4.5)   608 (3.3)  90 (0.6)   128 (1.2)  38 (0.3)  707 (3.4)  4,950
              ECOR 15   2,690 (55.0)   735 (4.5)   575 (6.1)  90 (0.6)   138 (0.6)  38 (0.3)  639 (2.1)  4,910
              ECOR 18   2,510 (15.3)   699 (3.8)   515 (2.5)  90 (0.6)   122 (0.5)  38 (0.3)  608 (3.3)  4,580
              ECOR 19   2,480 (11.6)   699 (3.8)   527 (2.0)  90 (0.6)   122 (0.5)  38 (0.3)  639 (2.1)  4,600
              ECOR 20   2,505 (25.0)   654 (8.8)   480 (0)    90 (0.6)   122 (0.5)  38 (0.3)  608 (3.3)  4,500
              ECOR 21   2,505 (25.0)   654 (8.8)   480 (0)    90 (0.6)   122 (0.5)  38 (0.3)  608 (3.3)  4,500
              ECOR 23   2,675 (5.0)    807 (3.8)   532 (3.0)  90 (0.6)   138 (0.6)  38 (0.3)  680 (3.4)  4,960
Group B1      ECOR 27   2,600 (33.3)   707 (3.4)   515 (2.5)  90 (0.6)   143 (0.9)  38 (0.3)  616 (1.8)  4,710
              ECOR 28   2,620 (27.5)   743 (3.9)   527 (2.0)  94 (0.7)   128 (1.2)  38 (0.3)  639 (2.1)  4,790
              ECOR 29   2,610 (60.0)   787 (2.6)   527 (2.0)  94 (0.7)   138 (0.6)  38 (0.3)  639 (2.1)  4,830
              ECOR 34   2,500 *        790 (5.6)   515 (2.5)  94 (0.7)   138 (0.6)  38 (0.3)  680 (3.4)  4,760
              ECOR 58   2,700 (25.3)   743 (3.9)   515 (2.5)  94 (0.7)   136 (1.8)  38 (0.3)  639 (2.1)  4,870
              ECOR 68   2,745 (25.0)   843 (5.5)   532 (3.0)  94 (0.7)   138 (0.6)  38 (0.3)  807 (3.8)  5,200
              ECOR 71   2,650 *        771 (8.0)   547 (5.4)  90 (0.6)   138 (0.6)  38 (0.3)  654 (8.8)  4,890
              ECOR 72   2,635 (65.0)   771 (8.0)   532 (3.0)  94 (0.7)   138 (0.6)  38 (0.3)  680 (3.4)  4,890
Group B2      ECOR 51   2,750 (50.0)   810 (5.6)   550 (2.9)  112 (1.1)  138 (0.6)  38 (0.3)  810 (5.6)  5,210
              ECOR 56   2,590 (54.9)   824 (5.7)   550 (2.9)  112 (1.1)  138 (0.6)  38 (0.3)  707 (3.4)  4,960
              ECOR 57   2,860 (17.6)   810 (5.6)   550 (2.9)  112 (1.1)  138 (0.6)  38 (0.3)  810 (5.6)  5,320
              ECOR 60   2,580 (8.8)    790 (5.6)   608 (3.3)  112 (1.1)  138 (0.6)  38 (0.3)  824 (5.7)  5,090
              ECOR 61   2,505 (15.0)   776 (12.4)  527 (2.0)  112 (1.1)  166 (1.2)  38 (0.3)  776 (8.3)  4,900
              ECOR 62   2,585 (15.0)   843 (5.5)   527 (2.0)  112 (1.1)  166 (1.2)  38 (0.3)  790 (5.6)  5,060
              ECOR 63   2,700 (25.0)   873 (7.3)   550 (2.9)  112 (1.1)  138 (0.6)  38 (0.3)  807 (3.8)  5,220
              ECOR 64   2,775 (165.0)  810 (5.6)   581 (2.1)  112 (1.1)  175 (1.2)  38 (0.3)  743 (2.7)  5,230
              ECOR 65   2,500 (0.0)    787 (2.6)   527 (2.0)  104 (1.0)  138 (0.6)  38 (0.3)  707 (3.4)  4,800
Group D       ECOR 36   2,710 (26.5)   824 (5.7)   556 (4.0)  104 (1.0)  138 (0.6)  38 (0.3)  743 (2.7)  5,110
              ECOR 38   2,800 *        807 (3.8)   616 (1.8)  104 (1.0)  143 (0.9)  43 (0.9)  807 (3.8)  5,320
              ECOR 39   2,780 (80.0)   787 (2.6)   581 (2.1)  104 (1.0)  143 (0.9)  38 (0.3)  713 (6.3)  5,150
              ECOR 40   2,845 (35.0)   807 (3.8)   616 (1.8)  104 (1.0)  143 (0.9)  43 (0.9)  787 (2.6)  5,350
Group E       ECOR 31   2,775 (35.0)   743 (2.7)   547 (5.4)  94 (0.7)   138 (0.6)  38 (0.3)  735 (4.5)  5,070
              ECOR 37   3,100 (0.0)    787 (2.6)   581 (2.1)  94 (0.7)   175 (1.2)  38 (0.3)  743 (2.7)  5,520
              ECOR 42   2,735 (35.0)   743 (2.7)   616 (1.8)  94 (0.7)   143 (0.9)  38 (0.3)  699 (3.3)  5,070
a The E. coli subgroup designations are derived from the neighbor-joining tree of Herzer et al. (1990). The phylogenetic positions of subgroups are shown in figure 2.
b See figure 1 for positions of I-CeuI fragments. All fragment lengths are given in kb, and standard errors are in parentheses after the fragment size estimates.
c Chromosome size is computed as the sum of all I-CeuI fragment lengths rounded to the nearest 10 kb.
d An asterisk indicates that the size of the fragment was measured only once, and there is no estimate of standard error.
FIG. 2.—Linear representation of I-CeuI maps for 19 natural isolates of E. coli and E. coli K12. Letters A through G along the bottom
correspond to I-CeuI fragments, and the megabase scale at the top
shows chromosome lengths. Genetic relationships are inferred from
variation at 38 enzyme loci (Herzer et al. 1990). Numbers in this tree
represent ECOR strain designations, and letters represent major subgroups within E. coli. The shaded box depicts the range of variation
in chromosome size between E. coli K12 and the ECOR strain with
the largest chromosome.
FIG. 3.—Relationship between lengths of chromosomal regions
flanking the replication origin in natural isolates of E. coli. Region left
of the origin comprises I-CeuI fragments B and C, located between
56.2′ and 84.6′ on the E. coli K12 chromosome. (Although the replication origin lies within fragment C, >90% of this fragment is located counterclockwise to the origin.) The region right of the origin
comprises fragments D, E, F, and G, which map between 84.6′ and
5.1′ on the E. coli K12 chromosome (r = 0.83, P < 0.001, df = 33).
Variation Among Chromosomal Regions
With the exception of the smallest fragment (F), all
I-CeuI fragments show considerable length variation
among natural isolates of E. coli (table 1, fig. 2). For
example, I-CeuI fragment G ranges from 608 to 824 kb
in length, overlapping in size with fragments B (654 to
873 kb) and fragment C (480 to 616 kb). Therefore, the
identity of each fragment was determined by Southern
hybridizations to genes of known position on the E. coli
K12 chromosome. Probes to oriC and dif were used to
ascertain that the replication origin and terminus were
located on fragments C and A, respectively. PCR assays
of clpB, yhdZ, aroE, metA, and proS confirmed the identity and map position of the I-CeuI fragments. Although
these genes were generally recovered from the expected
fragments (based on size and Southern hybridizations),
there were a few exceptions: proS was amplified from
fragment B instead of fragment G of ECOR 57, and
yhdZ was detected on fragment G but not on fragment
B in ECOR 61. These cases indicate the occurrence of
either large-scale chromosomal inversions or translocations in these strains. There were also instances in which
amplification from the isolated fragments failed, although PCR assays on total genomic DNA indicated that
the gene was present in these strains. We did not detect
metA in fragments B, C, or G in ECOR 11 from subgroup A or ECOR 51, 56, 57, 61, 62 and 65 from subgroup B2. Furthermore, proS was not detected in these
same fragments in ECOR 61, 65, and 72.
FIG. 4.—Bivariate plot of phylogenetically independent contrasts
of lengths of regions flanking the replication origin. The neighbor-joining tree by Herzer et al. (1990) is used to remove covariation due
to common ancestry of isolates.
FIG. 5.—Relationship between the variances of different I-CeuI
fragments (A through G) standardized by their average size and the
average size of the corresponding fragments. Standardization should
remove the linear relationship between the variance and the average
fragment size; however, a significant relationship still exists. Fragment
G exhibits larger variation than expected for a fragment of this size.
To test whether certain regions of the chromosome
exhibit different levels of length variation, we compared
the standardized variances for each fragment across
strains in pairwise F-tests. There were no significant differences in degree of variation among the larger fragments, A, B, and G; however, fragments G and A are
significantly more variable than fragment C (whereas B
is not). The variance in the size of the smallest I-CeuI
fragment (F) is significantly less than that of all other
I-CeuI fragments, with its length ranging from 38 to 42
kb among the natural isolates.
There is a significant correlation between I-CeuI
fragment size and the standardized variance of each
fragment (r = 0.85, P < 0.05, df = 5) (fig. 5). This
relationship could, in principle, be explained by a positive correlation between chromosomal changes occurring within each region, which would increase the expected variance of each fragment. (If we suppose, for
example, that each I-CeuI fragment consists of two
smaller parts, the expected variance for the whole fragment would equal the sum of the variances of each part plus
twice the covariance of the parts. Therefore, standardized variances that are higher than expected could result
from correlation of changes within a fragment.)
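The identity invoked in the parenthetical argument, that the variance of a whole equals the sum of the variances of its two parts plus twice their covariance, can be checked numerically with a short Python sketch. The part lengths below are hypothetical values chosen so that the two parts change together; they are not measurements from this study, and the helper `covariance` is our own.

```python
from statistics import mean, variance

def covariance(xs, ys):
    """Sample covariance of two equally long lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical lengths (kb) of two parts of one fragment in five strains.
part_1 = [300, 320, 350, 310, 370]
part_2 = [200, 215, 240, 205, 250]
whole = [a + b for a, b in zip(part_1, part_2)]

var_whole = variance(whole)
var_from_parts = variance(part_1) + variance(part_2) + 2 * covariance(part_1, part_2)
independent_expectation = variance(part_1) + variance(part_2)

print(f"variance of whole fragment:        {var_whole:.1f}")
print(f"sum of variances + 2 * covariance: {var_from_parts:.1f}")
print(f"expectation if parts independent:  {independent_expectation:.1f}")
```

With positively correlated parts, the variance of the whole exceeds the value expected if the parts varied independently, which is the effect proposed for the larger fragments.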
Alternatively, the correlation between I-CeuI fragment size and its standardized variance could result from
the positions of particular I-CeuI fragments around the
chromosome. Smaller I-CeuI fragments are clustered
close to the replication origin, and if chromosome organization is more conserved closer to the replication
origin, we might expect that the standardized variance
in length is related to fragment length. And in fact, there
is a significant correlation between the standardized
variance of a fragment and distance from the replication
origin. However, there is not a significant difference in
variability between the half of the chromosome flanking
the replication origin (the sum of fragments B through
G) and that of the terminus (A), which would be expected if constraints on chromosome organization decreased from the origin to the terminus. In sum, this
analysis indicates that the relationship between standardized variance and average fragment size is due to
correlations in size changes over the entire chromosome
rather than to regional conservation closer to the origin
of replication.
Among serovars of S. enterica, I-CeuI fragment
G—corresponding to 90.5′ to 5.1′ on the E. coli K12
chromosome—accounts for a majority of the total difference in chromosome length between the largest and
the smallest chromosomes (Liu, Hessel, and Sanderson
1993). It appears that this same chromosomal region is
more variable in E. coli. In figure 5, the point for fragment G is well above the regression line of the standardized variance of I-CeuI fragment to average fragment size.
Chromosome Size and Repetitive DNA
Repetitive sequences, such as IS elements, are
widespread in the E. coli genome, and their copy numbers are highly variable among strains (Sawyer et al.
1987). In addition to their direct contribution to genome
size variation, repetitive DNAs can affect the rate of
large-scale genome rearrangements by providing regions
of homology for ectopic recombination and can lead to
duplications, deletions, inversions, translocations, and
the integration of DNA (Umeda and Ohtsubo 1989;
Deonier 1996). IS elements increase the rate of duplications and deletions, but because duplications are very
unstable and frequently revert, and deletions are virtually irreversible, high numbers of IS elements might act
to reduce chromosome sizes. Alternatively, IS elements
could act as sites for integration of plasmids and fragments of foreign DNA into the chromosome, thereby
increasing genome size.
The numbers of chromosomal copies of IS1—the
most numerous IS element in strains from the ECOR
collection—are negatively correlated with chromosome
size (r = -0.47, P < 0.01). However, when the independent-contrasts method is used to correct for the effect
of covariation due to shared ancestry, the negative association between IS number and genome size disappears. No other IS elements show significant correlation
to genome size.
Chromosome Size Versus Genome Size
There is reasonably good agreement between the
chromosome sizes based on I-CeuI fragment length and
the overall genome sizes, which include chromosomal
and extrachromosomal DNA, as estimated from Not I
and Bln I digests of 14 ECOR isolates (Bergthorsson
and Ochman 1995). There is, however, one notable exception: ECOR 37 is now estimated to have a 5.5-Mb
chromosome, whereas previous estimates yielded a total
genome size of approximately 4.9 Mb. This discrepancy
results from at least two previously unresolved comigrating Not I and Bln I restriction fragments, which led
to an underestimation of genome size.
Table 2
Chromosome Size and the Sizes of Large Plasmids in 14
Natural Isolates of Escherichia coli
Strain     Chromosome Length (kb)
ECOR 4     4,720
ECOR 13    4,580
ECOR 14    4,950
ECOR 15    4,910
ECOR 28    4,790
ECOR 29    4,830
ECOR 37    5,520
ECOR 38    5,320
ECOR 40    5,350
ECOR 51    5,210
ECOR 62    5,060
ECOR 63    5,220
ECOR 68    5,200
ECOR 71    4,890
Sizes of
Large Plasmids
(kb)
92,
87
40
102
170
49,
60,
60,
104
69
65, 85
40
130, 110
90
To determine the effect of large plasmids on previous estimates of genome size using Not I and Bln I,
we sampled 14 ECOR strains for the presence of large
plasmids. Consistent with an earlier survey of plasmid
distribution among ECOR strains (Hartl et al. 1986), approximately 70% (10 of 14) of the isolates contained
plasmids over 40 kb in length (table 2). Digestion of
plasmids with Bln I and Not I revealed that in 7 of the
14 isolates (ECOR 13, 14, 15, 28, 29, 62, and 71), plasmids were included in the original estimates of genome
size (Bergthorsson and Ochman 1995). We detected no
significant relationship between chromosome size and
the cumulative sizes of large plasmids for these 14
strains (r = -0.09), indicating that total DNA content
in enteric bacteria is not under strong stabilizing selection, despite the fact that E. coli K12 and Typhimurium
LT2 have very similar chromosome lengths.
In some instances, the present analysis of plasmids
within these strains resolved certain inconsistencies between the estimates of genome size as derived from digestions with different restriction enzymes. For example, estimates of genome size for ECOR 29 based on
Bln I and Not I digests were 5,121 and 4,763 kb, respectively (Bergthorsson and Ochman 1995). However,
this strain is now known to harbor a 170-kb plasmid
with two Bln I restriction sites and no Not I sites. Because
only linearized plasmids are resolved under our
electrophoretic conditions (Beverley 1988), the original
genome size estimate of this strain calculated from Bln
I digests, which included plasmid fragments, was larger
than that estimated from Not I digests, which did not
include any plasmid-associated fragments. Moreover, owing to
plasmid copy number, these Bln I fragments appeared as doublets in the original digests, resulting in an
overestimation of some 340 kb. Once these plasmid
fragments are eliminated, the chromosome size estimates for ECOR 29 based on the fragments produced
by Bln I and Not I digests differ by less than 1% and
agree well with values obtained from I-CeuI digests.
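
The ECOR 29 reconciliation described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: it assumes that the overestimate of some 340 kb corresponds to the 170-kb plasmid being counted twice (as doublets) and that it is simply subtracted from the original Bln I estimate. The variable names are illustrative.

```python
# Original genome size estimates for ECOR 29 (kb), as quoted in the text.
bln_estimate = 5121   # Bln I digest; included plasmid fragments
not_estimate = 4763   # Not I digest; no plasmid fragments

plasmid_size = 170    # kb; plasmid cut by Bln I but not by Not I

# Assumption: the plasmid fragments ran as doublets, so the plasmid
# contributed twice its length to the Bln I total.
overestimate = 2 * plasmid_size

corrected_bln = bln_estimate - overestimate
difference = abs(corrected_bln - not_estimate)
percent_difference = 100 * difference / not_estimate

print(corrected_bln, difference, round(percent_difference, 2))
```

Under these assumptions the two corrected estimates differ by well under 1%, in line with the statement above.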
Growth Rates
If the rate of cell division is largely limited by replication rate and cell size, large genome sizes are expected to decrease growth rate (Stouthamer and Kooijman 1993). Mikkola and Kurland (1991) measured
growth rates in minimal media for several ECOR strains,
and, when combined with the chromosome size data
presented here, there is a significant negative correlation
between chromosome size and their growth rate estimates (r = -0.44, n = 28, P < 0.05; Kendall's τ =
-0.42, P < 0.05). When the phylogenetic relationships
of the ECOR strains are taken into account, the
correlation coefficient is still significant (r = -0.44, n
= 26, P < 0.05); however, this correlation is strongly
dependent on one data point (the contrast between
ECOR 56 and 57) and is sensitive to transformations
made on the branch lengths. Furthermore, a nonparametric test of associations between phylogenetic contrasts on chromosome size and growth rate is not significant (Kendall's τ = -0.18, P > 0.1).
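
For readers unfamiliar with the rank correlation used above, the following is a minimal pure-Python sketch of Kendall's tau (the tau-a variant, which ignores tie corrections). It is an illustration only: the input values are hypothetical, not measurements from this study, and no phylogenetic correction or significance test is performed.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # pairs tied in x or y count toward neither total
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical values for illustration only (not data from this study).
chromosome_kb = [4600, 4800, 5000, 5200, 5400]
growth_rate = [1.0, 0.9, 0.95, 0.7, 0.8]

tau = kendall_tau_a(chromosome_kb, growth_rate)
print(tau)
```

A negative value indicates that larger chromosomes tend to rank lower in growth rate; because the statistic depends only on ranks, a single extreme data point has less influence than it does on a product-moment correlation.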
Pathogenicity Islands
Several large clusters of virulence genes have been
discovered in pathogenic strains of E. coli (Hacker et al.
1990; McDaniel et al. 1995). The LEE pathogenicity island promotes attaching and effacing lesions and is inserted at the selC locus of enteropathogenic E. coli
(McDaniel et al. 1995). In a PCR-based assay, only
ECOR 37 was found to contain a LEE island at the selC locus. There
may be an additional insertion at selC in this strain, because the PCR product of the right-hand LEE junction is
approximately 6 kb in length, compared to 418 bp detected
in LEE-containing enteropathogenic strains of E. coli.
Discussion
Despite the high degree of similarity in the size and
organization of the E. coli K12 and S. enterica sv. Typhimurium LT2 chromosomes, natural isolates of E. coli
display a wide range of variation in chromosome lengths,
with some strains differing by more than 1 Mb. In E.
coli, the average length of a gene is almost 1 kb, and
some 85% of its chromosome is occupied by coding sequences (Burland et al. 1993, 1995). Therefore, the difference between the strains of E. coli with the smallest
and largest chromosomes might involve more than 800
genes. Although the nature of most of this genomic variation is presently unknown, there are three general
sources that could contribute to differences in chromosome length: (1) duplications and deletions; (2) the acquisition of foreign DNA, including the integration of
plasmids and phages; and (3) the accumulation of repetitive DNA, such as insertion sequences and transposons.
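
The estimate of more than 800 genes given above follows from simple arithmetic. As a rough sketch (treating the "almost 1 kb" average gene length as exactly 1 kb, which is a simplifying assumption, and using illustrative variable names):

```python
size_difference_kb = 1000    # ~1 Mb between the smallest and largest chromosomes
coding_fraction = 0.85       # share of the chromosome occupied by coding sequences
mean_gene_length_kb = 1.0    # assumption: "almost 1 kb" rounded to 1 kb

# Coding DNA in the size difference, divided by the mean gene length.
estimated_genes = size_difference_kb * coding_fraction / mean_gene_length_kb
print(round(estimated_genes))
```

Because the true mean gene length is slightly under 1 kb, this figure is, if anything, a conservative count.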
The Sources of Chromosome Size Variation
Riley and Labedan (1996) reported that more than
half of the coding sequences in E. coli K12 share regions
of similarity with other genes in the genome, suggesting
that the evolution of the E. coli chromosome involved
the duplication and subsequent divergence of ancestral
genes. Despite their conclusions, large-scale duplications are probably not an important factor in chromosome size variation observed within E. coli. Although
duplications are relatively common, most are unstable
and are maintained only under strong selection (Sonti
and Roth 1989). Moreover, we did not observe any additional I-CeuI sites, suggesting that little of the chromosome size variation in natural populations of E. coli
originated through duplications.
In some cases, E. coli K12 has more than one gene
coding for the same function, such as the genes coding
for ornithine carbamoyltransferase, argI and argF,
which show 78% nucleotide sequence similarity (Van
Vliet, Boyen, and Glansdorff 1988). However, argF is
located in a region flanked by IS1 elements and has an
unusually high GC content, suggesting that it was
gained by horizontal transfer rather than through the duplication of an existing gene (York and Stodolsky 1981;
Van Vliet, Boyen, and Glansdorff 1988). Therefore,
much of the similarity among E. coli K12 genes detected
by Riley and Labedan (1996) could also have arisen
through the transfer or fusing of orthologous genes (i.e.,
genes that diverged after speciation but were later incorporated into the same genome), rather than through
gene duplications.
Horizontal transfer is likely to be the major source
of chromosome size variation among natural strains. Examinations of the features of sequenced genes of E. coli
K12 suggest that at least 6% (Whittam and Ake 1993),
or as much as 17% (Medigue et al. 1991; Lawrence and
Ochman 1997), of the E. coli K12 genome has atypical
GC contents or codon usage patterns, suggesting that
these genes were acquired through horizontal transfer.
And, based on an extrapolation from the number of recently acquired genes, Lawrence and Ochman (1997)
estimate that the E. coli K12 lineage has gained and lost
more than 3,000 kb since its divergence from S. enterica.
While size variation in eukaryotic genomes is
largely attributable to repetitive DNA, prokaryotic genomes consist primarily of single-copy sequences. We
expect no simple relationship between genome size and
the amount of multicopy DNA in bacteria because transposable sequences, such as IS elements, promote rearrangements that can either decrease or increase genome
size. The number of IS elements is highest in strains of
E. coli with the smallest chromosomes; however, we detected no significant correlation between number of insertion sequences and chromosome size when the phylogenetic relationships of the strains were taken into account.
Distribution of Length Variation in Bacterial
Chromosomes
In some species of bacteria, alterations in chromosome structure map to specific portions of the chromosome. For example, in Bacillus cereus, where chromosome sizes range from 2.4 to 5.3 Mb (Carlson and
Kolstø 1994), a single highly variable region is present
only in strains with larger chromosomes, and in Pseudomonas
aeruginosa, the chromosome can be subdivided into two segments: a stable auxotroph-rich region
encompassing the origin of replication and a variable
auxotroph-poor region flanking the terminus (Römling,
Greipel, and Tümmler 1995).
Comparisons of physical maps of laboratory isolates descended from E. coli K12 reveal a disproportionately large number of chromosome rearrangements
near the replication terminus, presumably due to higher
rates of recombination in this region (Perkins et al.
1993). This is also consistent with the notion that the
termination region contains fewer essential genes (Henson and Kuempel 1985) and can tolerate the accumulation of alterations. However, the density of genes near
the terminus appears to be similar to that of the rest of
the chromosome (Moir et al. 1992), and in natural isolates of E. coli, we found that the region surrounding
the replication terminus was not more variable in size
than the region surrounding the replication origin. In
fact, strains with larger genomes display an increase in
the sizes of the majority of their I-CeuI restriction fragments, suggesting that the acquisition and deletion
events leading to the variation among natural isolates of
E. coli are distributed, and correlated, throughout the
chromosome.
In Salmonella, most variation in chromosome
length is associated with a single region (I-CeuI fragment G) which corresponds to a 670-kb region at 90′
to 5′ on the E. coli K12 physical map (Liu, Hessel, and
Sanderson 1993). The same chromosomal region is also
highly variable in natural isolates of E. coli, perhaps due
to the presence of conserved sequences that act as sites
for recombination. For example, this region contains
several tRNA genes, such as the leucyl-tRNA at 97′,
which can serve as insertion sites of phages and pathogenicity islands (Cheetham and Katz 1995).
Chromosome Symmetry
The replication origin and terminus of the E. coli
chromosome are diametrically opposed, and in experimental populations of E. coli, inversions that disrupt this
orientation impede cellular growth rates (Riley and Sanderson 1990; Hill, Harvey, and Gray 1990; François et
al. 1990). Moreover, in two serovars of S. enterica—
Typhi and Paratyphi A—large chromosomal insertions
have been accompanied by inversions that restore the
symmetry between the replication origin and terminus
(Liu and Sanderson 1995a, 1995b).
Among natural isolates of E. coli, there is a highly
significant association between the cumulative size of I-CeuI fragments clockwise (fragments D through G) and
those counterclockwise (fragments B and C) of the replication origin. This symmetrical distribution of length
variation in natural isolates supports the notion that the
large-scale organization of bacterial chromosomes is, in
part, governed by stabilizing selection which acts to
maintain equal distances on both sides of the chromosome between the origin and terminus. Strains that have
incurred an insertion on one side of the replication origin are more likely to have changes that enlarge the
opposite side. Although these multiple events may not
fully restore chromosome symmetry, they can serve to
compensate for the original change.
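
The comparison described above can be sketched as follows: sum the I-CeuI fragments on each side of the origin for every strain, then correlate the two totals across strains. This is an illustration under stated assumptions, not the procedure used in this study: the fragment sizes are hypothetical, the helper names are ours, and no phylogenetic correction is applied.

```python
from math import sqrt

CLOCKWISE = ("D", "E", "F", "G")   # fragments clockwise of the origin
COUNTERCLOCKWISE = ("B", "C")      # fragments counterclockwise of the origin

def side_lengths(fragments_kb):
    """Return (clockwise, counterclockwise) cumulative lengths in kb."""
    cw = sum(fragments_kb[name] for name in CLOCKWISE)
    ccw = sum(fragments_kb[name] for name in COUNTERCLOCKWISE)
    return cw, ccw

def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical fragment sizes (kb) for three strains; illustration only.
strains = [
    {"B": 700, "C": 500, "D": 100, "E": 130, "F": 40, "G": 650},
    {"B": 720, "C": 520, "D": 100, "E": 140, "F": 40, "G": 680},
    {"B": 760, "C": 570, "D": 110, "E": 150, "F": 45, "G": 735},
]

cw_lengths, ccw_lengths = zip(*(side_lengths(s) for s in strains))
r = pearson_r(cw_lengths, ccw_lengths)
print(cw_lengths, ccw_lengths, round(r, 3))
```

A strongly positive r in real data would correspond to the pattern reported here, in which enlargement on one side of the origin tends to be matched by enlargement on the other.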
There is also a correlation between the region flanking the origin of replication (fragments B through G) and fragment A,
which contains the replication terminus. This correlation
is also caused by insertions that serve to maintain symmetry between the origin and terminus because the effects of a large insertion on one side of the origin of
replication could be offset by an insertion anywhere on
the opposite side. In principle, any disruption of chromosome symmetry caused by a large insertion on either side of the chromosome could also be counterbalanced by a similarly large deletion on the same side,
which would eliminate the correlation between the
lengths of the regions flanking the replication origin and
the terminus. Presumably, most chromosome size variation among strains is due to insertions rather than to
deletions, because a large deletion would usually include
essential genes and would be deleterious.
Although we attribute the variation in sizes of I-CeuI fragments to the cumulative effects of insertions
and deletions, it is possible that inversions or translocations have caused some of the observed variation.
However, such chromosomal rearrangements do not affect the overall variation in chromosome size and would
contribute negative, not positive, correlations between
the lengths of different chromosomal regions.
The maintenance of chromosome symmetry may
not be the only reason why the lengths of different portions of the chromosome are correlated. Differences between strains in their intrinsic rates of insertions or deletions could introduce correlations in length between
different chromosomal regions and could also account
for the correlation between the standardized variance in
I-CeuI fragment length and the average length of a fragment.
Genome Size and Growth Rate
Mikkola and Kurland (1991) have determined that
most of the variation in growth rates among natural isolates of E. coli is due to differences in translational efficiency. We find no significant association between total
DNA content of natural isolates and the growth rate estimates of Mikkola and Kurland (1991). Therefore, the
variation in growth rates resulting from ribosomal kinetics appears to overwhelm any effects due to DNA
content. This counters the view that bacterial genomes
respond to selection for faster growth rates by streamlining the sizes of their chromosomes. If small genomes
were important for faster growth in nature, we would
expect some association between ribosomal kinetics and
genome size. However, growth rates of E. coli in nature
are only one to two generations per day, far slower than the 20-min doubling time that laboratory strains achieve at their maximum growth rate, and the constraints on
growth rate under natural conditions are very different
from those in culture. In fact, Mikkola and Kurland
(1992) found that natural isolates of E. coli converge on
the growth rate characteristic of laboratory isolates when
allowed to adapt to laboratory conditions for a few hundred generations.
The Ages of E. coli Subgroups and the Rate of
Chromosome Size Divergence
By scaling the genetic divergence among strains
from different subgroups of E. coli to that between E.
coli and S. enterica, it is possible to estimate the rate of
chromosome size evolution within natural populations.
Between isolates of E. coli from subgroup A to those
of subgroups B2 and D (fig. 2), the average divergence
at synonymous sites for six loci is 0.044 for mdh, 0.153
for trpB, 0.114 for trpC, 0.133 for putP, 0.016 for gapA,
and 0.061 for crr (Nelson, Whittam, and Selander 1991;
Hall and Sharp 1992; Nelson and Selander 1992; Boyd
et al. 1994; Milkman 1996). The amount of divergence
between E. coli and S. enterica at the same loci ranges
from 0.252 for gapA to 1.39 for trpC. Assuming that E.
coli and S. enterica diverged 140 MYA and that the rate
of molecular evolution has been uniform within lineages, the dates of divergence between these E. coli subgroups range from 8 to 22 MYA.
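
One plausible reading of this scaling, shown here as a sketch rather than as the authors' exact calculation, is that each locus gives a date equal to the 140-MYA split multiplied by the ratio of within-E. coli divergence to E. coli versus S. enterica divergence. Only gapA and trpC have both values quoted in the text, so only those two loci are used; the function and variable names are illustrative.

```python
SPLIT_MYA = 140  # assumed E. coli / S. enterica divergence date, from the text

# Synonymous divergence per locus:
# (between E. coli subgroups, between E. coli and S. enterica)
loci = {
    "gapA": (0.016, 0.252),
    "trpC": (0.114, 1.39),
}

def divergence_date(within, between, split_mya=SPLIT_MYA):
    """Scale the split date by the divergence ratio (uniform-rate assumption)."""
    return split_mya * within / between

dates = {locus: divergence_date(w, b) for locus, (w, b) in loci.items()}
for locus, mya in dates.items():
    print(f"{locus}: {mya:.1f} MYA")
```

Both loci give dates inside the 8 to 22 MYA interval quoted above; the remaining loci cannot be checked this way without their interspecific divergence values.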
This estimate is considerably lower than the date
of 80 MYA proposed by Tominaga et al. (1994) for the
divergence of Shigella and E. coli, which was based on
sequence variation of fliC genes. Since Shigella spp. are
a polyphyletic group of strains that arose from within
E. coli, one expects the time of divergence between Shigella and E. coli to be the same as, or less than, that
between subgroups of E. coli. However, fliC of Shigella
has undergone extensive recombination with the flagellin genes from Salmonella, and it is also under diversifying selection (Tominaga et al. 1994; Li et al. 1994),
which increases its level of sequence polymorphism,
leading to an overestimate of divergence time.
Applying these divergence times to the average difference in chromosome size of isolates from subgroups
A and subgroups B2 and D yields a rate of chromosome
size divergence of 16–44 kb/Myr. Based on comparisons
between E. coli K12 and S. enterica sv. Typhimurium
LT2, chromosome size appears to be well conserved,
despite an estimated rate of additions and deletions of
30 kb/Myr (Lawrence and Ochman 1997). Although
chromosome sizes within E. coli are highly variable,
there is a large phylogenetic component to this variation
indicating that chromosome sizes also remain relatively
stable in natural populations of E. coli over long evolutionary periods, which reflects some constraints on absolute chromosome size in addition to those imposed on
chromosome symmetry.
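
The rate of 16–44 kb/Myr quoted above can be related to the divergence dates with a short calculation. The average size difference between subgroups is not restated in this passage, so the sketch below back-calculates it from the reported rates and dates; the resulting figure of roughly 350 kb is an inference, and the simple difference-over-time definition of the rate is an assumption.

```python
def size_divergence_rate(size_difference_kb, divergence_mya):
    """Rate of chromosome size divergence in kb per Myr (difference / time)."""
    return size_difference_kb / divergence_mya

youngest_mya, oldest_mya = 8, 22   # divergence date range from the text
rate_high, rate_low = 44, 16       # reported rate range (kb/Myr)

# Back-calculated size difference implied by each end of the range.
implied_from_youngest = rate_high * youngest_mya
implied_from_oldest = rate_low * oldest_mya
print(implied_from_youngest, implied_from_oldest)

# Forward check: the implied difference reproduces the reported rates.
print(size_divergence_rate(implied_from_youngest, youngest_mya),
      size_divergence_rate(implied_from_oldest, oldest_mya))
```

Both ends of the range imply the same size difference, which suggests that a single average difference was divided by the youngest and oldest divergence dates.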
Acknowledgment
This work was supported by NIH grant GM56120.
LITERATURE CITED
BERGTHORSSON, U., and H. OCHMAN. 1995. Heterogeneity of
genome sizes among natural isolates of Escherichia coli. J.
Bacteriol. 177:5784–5789.
BEVERLEY, S. M. 1988. Characterization of the ‘unusual’ mobility of large circular DNAs in pulsed field-gradient electrophoresis. Nucleic Acids Res. 16:925–939.
BLATTNER, F. R., G. PLUNKETT III, C. A. BLOCH et al. (13 coauthors). 1997. The complete genome sequence of Escherichia coli K-12. Science 277:1453–1462.
BOYD, E. F., K. NELSON, F.-S. WANG, T. S. WHITTAM, and R.
K. SELANDER. 1994. Molecular genetic basis of allelic polymorphism in malate dehydrogenase (mdh) in natural populations of Escherichia coli and Salmonella enterica. Proc.
Natl. Acad. Sci. USA 91:1280–1284.
BRENNER, D. J., G. R. FANNING, F. J. SKERMAN, and S. FALKOW. 1972. Polynucleotide sequence divergence among
strains of Escherichia coli and closely related organisms. J.
Bacteriol. 109:953–965.
BURLAND, V., G. PLUNKETT III, D. L. DANIELS, and F. R.
BLATTNER. 1993. DNA sequence and analysis of 136 kilobases of the Escherichia coli genome: organizational symmetry around the origin of replication. Genomics 16:551–
561.
BURLAND, V., G. PLUNKETT III, H. J. SOFIA, D. L. DANIELS,
and F. R. BLATTNER. 1995. Analysis of the Escherichia coli
genome VI: DNA sequence of the region from 92.8 through
100 minutes. Nucleic Acids Res. 23:2105–2119.
CARLSON, C. R., and A.-B. KOLSTØ. 1994. A small (2.4 Mb) Bacillus cereus chromosome corresponds to a conserved region of a larger (5.3 Mb) Bacillus cereus chromosome. Mol.
Microbiol. 13:161–169.
CHEETHAM, B. F., and M. E. KATZ. 1995. A role for bacteriophages in the evolution and transfer of bacterial virulence
determinants. Mol. Microbiol. 18:201–208.
DEONIER, R. C. 1996. Native insertion sequence elements: Locations, distributions, and sequence relationships. Pp. 2000–
2011 in F. C. NEIDHARDT, R. CURTISS III, J. L. INGRAHAM,
E. C. C. LIN, K. B. LOW, B. MAGASANIK, W. S. REZNIKOFF,
M. RILEY, M. SCHAECHTER, and H. E. UMBARGER, eds.
Escherichia coli and Salmonella typhimurium. Cellular and
molecular biology. 2nd edition. ASM Press, Washington,
D.C.
DESJARDINS, P., B. PICARD, B. KALTENBÖCK, J. ELION, and E.
DENAMUR. 1995. Sex in Escherichia coli does not disrupt
the clonal structure of the population: Evidence from random amplified polymorphic DNA and restriction-fragment-length polymorphism. J. Mol. Evol. 41:440–448.
DIMRI, G. P., K. E. RUDD, M. K. MORGAN, H. BAYAT, and G.
F.-L. AMES. 1992. Physical mapping of repetitive extragenic
palindromic sequences in Escherichia coli and phylogenetic
distribution among Escherichia coli strains and other enteric
bacteria. J. Bacteriol. 174:4583–4593.
FELSENSTEIN, J. 1985. Phylogenies and the comparative method. Am. Nat. 125:1–15.
FRANÇOIS, V., J. LOUARN, J.-E. REBOLLO, and J.-M. LOUARN.
1990. Replication termination, nondivisible zones, and
structure of the Escherichia coli chromosome. Pp. 351–359
in K. DRLICA and M. RILEY, eds. The bacterial chromosome. ASM Press, Washington, D.C.
GARLAND, T. JR., A. W. DICKERMAN, C. M. JANIS, and J. A.
JONES. 1993. Phylogenetic analysis of covariance by computer simulation. Syst. Biol. 42:265–292.
GARLAND, T. JR., P. H. HARVEY, and A. R. IVES. 1992. Procedures for the analysis of comparative data using phylogenetically independent contrasts. Syst. Biol. 41:18–32.
HACKER, J., L. BENDER, M. OTT, J. WINGENDER, B. LUND, R.
MARRE, and W. GOEBEL. 1990. Deletions of chromosomal
regions coding for fimbriae and hemolysins occur in vitro
and in vivo in various extraintestinal Escherichia coli isolates. Microb. Pathog. 8:213–225.
HALL, B. G., L. L. PARKER, P. W. BETTS, R. F. DUBOSE, S. A.
SAWYER, and D. L. HARTL. 1989. IS103, a new insertion
element in Escherichia coli: characterization and distribution in natural populations. Genetics 121:423–431.
HALL, B. G., and P. M. SHARP. 1992. Molecular population
genetics of Escherichia coli: DNA sequence diversity at the
celC, crr, and gutB loci of natural isolates. Mol. Biol. Evol.
9:654–665.
HARTL, D. L., M. MEDHORA, L. GREEN, and D. E. DYKHUIZEN.
1986. The evolution of DNA sequences in Escherichia coli.
Phil. Trans. R. Soc. Lond. B Biol. Sci. 312:191–204.
HENSON, J. M., and P. L. KUEMPEL. 1985. Deletion of the terminus region (340 kilobase pairs of DNA) from the chromosome of Escherichia coli. Proc. Natl. Acad. Sci. USA
82:3766–3770.
HERZER, P. J., S. INOUYE, M. INOUYE, and T. S. WHITTAM.
1990. Phylogenetic distribution of branched RNA-linked
multicopy single-stranded DNA among natural isolates of
Escherichia coli. J. Bacteriol. 172:6175–6181.
HILL, C. W., S. HARVEY, and J. A. GRAY. 1990. Recombination
between rRNA genes in Escherichia coli and Salmonella
typhimurium. Pp. 335–340 in K. DRLICA and M. RILEY, eds.
The bacterial chromosome. ASM Press, Washington, D.C.
HOLM, S. 1979. A simple sequentially rejective multiple test
procedure. Scand. J. Stat. 6:65–70.
KADO, C. I., and S.-T. LIU. 1981. Rapid procedure for detection
and isolation of large and small plasmids. J. Bacteriol. 145:
1365–1373.
KOHARA, Y., K. AKIYAMA, and K. ISONO. 1987. The physical
map of the whole E. coli chromosome: application of a new
strategy for rapid analysis and sorting of a large genomic
library. Cell 50:495–508.
LAWRENCE, J. G., and H. OCHMAN. 1997. Amelioration of bacterial genomes: rates of change and exchange. J. Mol. Evol.
44:383–397.
LAWRENCE, J. G., H. OCHMAN, and D. L. HARTL. 1992. The
evolution of insertion sequences within enteric bacteria. Genetics 131:9–20.
LI, J., K. NELSON, A. C. MCWHORTER, T. S. WHITTAM, and R.
K. SELANDER. 1994. Recombinational basis of serovar diversity in Salmonella enterica. Proc. Natl. Acad. Sci. USA
91:2552–2556.
LIU, S.-L., and K. E. SANDERSON. 1992. A physical map of the
Salmonella typhimurium LT2 genome made by using XbaI
analysis. J. Bacteriol. 174:1662–1672.
LIU, S.-L., and K. E. SANDERSON. 1995a. Rearrangements in
the genome of the bacterium Salmonella typhi. Proc. Natl.
Acad. Sci. USA 92:1018–1022.
LIU, S.-L., and K. E. SANDERSON. 1995b. I-CeuI reveals conservation of the genome of independent strains of Salmonella typhimurium. J. Bacteriol. 177:3355–3357.
LIU, S.-L., A. HESSEL, and K. E. SANDERSON. 1993. Genomic
mapping with I-CeuI, an intron-encoded endonuclease specific for genes for ribosomal RNA, in Salmonella spp.,
Escherichia coli, and other bacteria. Proc. Natl. Acad. Sci.
USA 90:6874–6878.
MCDANIEL, T. K., K. G. JARVIS, M. S. DONNENBERG, and J.
B. KAPER. 1995. A genetic locus of enterocyte effacement
conserved among diverse enterobacterial pathogens. Proc.
Natl. Acad. Sci. USA 92:1664–1668.
MARSHALL, P., and C. LEMIEUX. 1992. The I-CeuI endonuclease recognizes a sequence of 19 base pairs and preferentially cleaves the coding strand of the Chlamydomonas moewusii chloroplast large subunit rRNA gene. Nucleic Acids
Res. 20:6401–6407.
MARTINS, E. P. 1996. Conducting phylogenetic comparative
studies when the phylogeny is not known. Evolution 50:
12–22.
MÉDIGUE, C., T. ROUXEL, P. VIGIER, A. HÉNAUT, and A. DANCHIN. 1991. Evidence for horizontal gene transfer in Escherichia coli speciation. J. Mol. Biol. 222:851–856.
MIKKOLA, R., and C. G. KURLAND. 1991. Is there a unique
ribosome phenotype for naturally occurring Escherichia
coli? Biochimie 73:1061–1066.
MIKKOLA, R., and C. G. KURLAND. 1992. Selection of laboratory wild-type phenotype
from natural isolates of Escherichia coli in chemostats.
Mol. Biol. Evol. 9:394–402.
MILKMAN, R. 1996. Recombinational exchange among clonal
populations. Pp. 2663–2684 in F. C. NEIDHARDT, R. CURTISS III, J. L. INGRAHAM, E. C. C. LIN, K. B. LOW, B.
MAGASANIK, W. S. REZNIKOFF, M. RILEY, M. SCHAECHTER,
and H. E. UMBARGER, eds. Escherichia coli and Salmonella
typhimurium. Cellular and molecular biology. 2nd edition.
ASM Press, Washington, D.C.
MOIR, P. D., R. SPIEGELBERG, I. R. OLIVER, J. H. PRINGLE, and
M. MASTERS. 1992. Proteins encoded by the Escherichia
coli replication terminus region. J. Bacteriol. 174:2102–
2110.
NELSON, K., and R. K. SELANDER. 1992. Evolutionary genetics
of the proline permease gene (putP) and the control region
of the proline utilization operon in populations of Salmonella and Escherichia coli. J. Bacteriol. 174:6886–6895.
NELSON, K., T. S. WHITTAM, and R. K. SELANDER. 1991. Nucleotide polymorphism and evolution in the glyceraldehyde-3-phosphate dehydrogenase gene (gapA) in natural populations of Salmonella and Escherichia coli. Proc. Natl.
Acad. Sci. USA 88:6667–6671.
OCHMAN, H., and R. K. SELANDER. 1984. Standard reference
strains of Escherichia coli from natural populations. J. Bacteriol. 157:690–693.
OCHMAN, H., and A. C. WILSON. 1987. Evolution in bacteria:
evidence for a universal substitution rate in cellular genomes. J. Mol. Evol. 26:74–86.
PERKINS, J. D., J. D. HEATH, B. R. SHARMA, and G. M. WEINSTOCK. 1993. XbaI and BlnI genomic cleavage maps of
Escherichia coli K-12 strain MG1655 and comparative
analysis of other strains. J. Mol. Biol. 232:419–445.
RICE, W. R. 1989. Analyzing tables of statistical tests. Evolution 43:223–225.
RILEY, M., and S. KRAWIEC. 1987. Genome organization. Pp.
967–981 in F. C. NEIDHARDT, J. L. INGRAHAM, K. B. LOW,
B. MAGASANIK, M. SCHAECHTER, and H. E. UMBARGER,
eds. Escherichia coli and Salmonella typhimurium: cellular
and molecular biology. ASM Press, Washington, D.C.
RILEY, M., and B. LABEDAN. 1996. Escherichia coli gene products: physiological functions and common ancestries. Pp.
2118–2202 in F. C. NEIDHARDT, R. CURTISS III, J. L. INGRAHAM, E. C. C. LIN, K. B. LOW, B. MAGASANIK, W. S.
REZNIKOFF, M. RILEY, M. SCHAECHTER, and H. E. UMBARGER, eds. Escherichia coli and Salmonella typhimurium.
Cellular and molecular biology. 2nd edition. ASM Press,
Washington, D.C.
RILEY, M., and K. E. SANDERSON. 1990. Comparative genetics
of Escherichia coli and Salmonella typhimurium. Pp. 85–
95 in K. DRLICA and M. RILEY, eds. The bacterial chromosome. ASM Press, Washington, D.C.
RÖMLING, U., J. GREIPEL, and B. TÜMMLER. 1995. Gradient of
genomic diversity in the Pseudomonas aeruginosa chromosome. Mol. Microbiol. 17:323–332.
SAWYER, S. A., D. E. DYKHUIZEN, R. F. DUBOSE, L. GREEN,
T. MUTANGADURA-MHLANGA, D. F. WOLCZYK, and D. L.
HARTL. 1987. Distribution and abundance of insertion sequences among natural isolates of Escherichia coli. Genetics 115:51–63.
SHARP, P. M. 1991. Determinants of DNA sequence divergence
between Escherichia coli and Salmonella typhimurium: codon usage, map position, and concerted evolution. J. Mol.
Evol. 33:23–33.
SMITH, C. L., J. G. ECONOME, A. SCHUTT, S. KLCO, and C. R.
CANTOR. 1987. A physical map of the Escherichia coli K12
genome. Science 236:1448–1453.
SOKAL, R. R., and F. J. ROHLF. 1981. Biometry. Freeman, San
Francisco.
SONTI, R. V., and J. R. ROTH. 1989. Role of gene duplications
in the adaptation of Salmonella typhimurium to growth on
limiting carbon sources. Genetics 123:19–28.
STOUTHAMER, A. H., and S. A. L. M. KOOIJMAN. 1993. Why
it pays for bacteria to delete disused DNA and to maintain
megaplasmids. Antonie van Leeuwenhoek 63:39–43.
TOMINAGA, A., M. A.-H. MAHMOUD, T. MUKAIHARA, and M.
ENOMOTO. 1994. Molecular characterization of intact, but
cryptic, flagellin genes in the genus Shigella. Mol. Microbiol. 12:277–285.
UMEDA, M., and E. OHTSUBO. 1989. Mapping of insertion elements IS1, IS2, and IS3 on the Escherichia coli K-12 chromosome. Role of the insertion elements in formation of
Hfrs and F′ factors and in rearrangements of bacterial chromosomes. J. Mol. Biol. 208:601–614.
VAN VLIET, F., A. BOYEN, and N. GLANSDORFF. 1988. On interspecies gene transfer: the case of the argF gene of Escherichia coli. Ann. Inst. Pasteur/Microbiol. 139:493–496.
WHITTAM, T. S., and S. E. AKE. 1993. Genetic polymorphisms
and recombination in natural populations of Escherichia
coli. Pp. 223–245 in N. TAKAHATA and A. G. CLARK, eds.
Mechanisms of molecular evolution. Japan Scientific Societies Press, Tokyo.
YORK, M. K., and M. STODOLSKY. 1981. Characterization of
P1argF derivatives from Escherichia coli K12 transduction.
I. IS1 elements flank the argF gene segment. Mol. Gen.
Genet. 181:230–240.
JULIAN P. ADAMS, reviewing editor
Accepted September 29, 1997