Coding potential of the chromosomal segment spanned by

SUPPLEMENTARY INFORMATION
Extracting and determining enrichment of pQBR103 DNA for sequencing. pQBR103 was extracted (Lilley et al., 1994) and further enriched by electrophoresis and excision from low melting point agarose gels. DNA was recovered from agarose using phenol extraction and ethanol precipitation. The enrichment of pQBR103 plasmid DNA relative to P. fluorescens SBW25 chromosomal DNA was determined by PCR targeting the plasmid-encoded merA and merR genes and the host 16S rRNA genes (details of primers available on request). Purified plasmid and host control DNA were each diluted from 10^0 to 10^-6. 25 µl PCR reactions were performed using BioLine (UK) reagents with 0.5 µl template DNA, 0.2 pmol of each primer, 2.0 mM dNTP, 1.5 mM MgCl2 and 0.1 U/µl Taq polymerase, with cycling conditions of [3 min at 94°C], 30 x [0.5 min at 94°C, 1 min at 54°C, 1 min at 72°C], then [10 min at 72°C]. Five microlitre aliquots of the PCR reactions were examined by gel electrophoresis, and the dilution at which products were no longer seen was used to estimate the ratio of plasmid to chromosomal DNA. The final purification of pQBR103 DNA resulted in an enrichment factor of at least 100x, which was acceptable for library construction (i.e. ~80% of cloned fragments would be of pQBR103 origin). Finally, the XbaI restriction pattern of the purified pQBR103 was compared by gel electrophoresis with that obtained from total SBW25/pQBR103 and SBW25 chromosomal DNA (both isolated by the CTAB method (Sulakhe et al, 2005)), to confirm that the purified DNA was pQBR103 rather than that of a possible deletion mutant or the result of some other plasmid misidentification.
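The endpoint-dilution estimate described above can be sketched as follows. This is only an illustration of the arithmetic: the function names, the example gel readings and the assumed plasmid share of total DNA before enrichment (0.04) are hypothetical and are not values reported in this study.

```python
def last_positive_exponent(band_seen):
    """Return the most dilute template (lowest power of ten) that still gave a band.

    band_seen maps a dilution exponent (0 for undiluted, -6 for 10^-6) to True/False.
    """
    positives = [exp for exp, seen in band_seen.items() if seen]
    if not positives:
        raise ValueError("no product detected at any dilution")
    return min(positives)


def plasmid_to_chromosome_ratio(plasmid_bands, chromosome_bands):
    """Ratio of plasmid to chromosomal target inferred from the two dilution endpoints."""
    diff = last_positive_exponent(chromosome_bands) - last_positive_exponent(plasmid_bands)
    return 10.0 ** diff


def cloned_fraction(enrichment, baseline_fraction):
    """Expected plasmid-derived share of DNA after enrichment.

    baseline_fraction is the plasmid share before enrichment (an assumed input).
    """
    odds = enrichment * baseline_fraction / (1.0 - baseline_fraction)
    return odds / (1.0 + odds)


# Hypothetical gel readings: True means a product was visible at that dilution.
purified = plasmid_to_chromosome_ratio(
    {0: True, -1: True, -2: True, -3: True, -4: True, -5: True, -6: False},
    {0: True, -1: True, -2: True, -3: False, -4: False, -5: False, -6: False},
)
control = plasmid_to_chromosome_ratio(
    {0: True, -1: True, -2: True, -3: True, -4: True, -5: False, -6: False},
    {0: True, -1: True, -2: True, -3: True, -4: False, -5: False, -6: False},
)
enrichment = purified / control
print(enrichment, round(cloned_fraction(enrichment, 0.04), 2))
```

Because ten-fold dilution steps are used, an estimate of this kind is only accurate to within about one order of magnitude, which is consistent with the enrichment being reported as "at least 100x".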
Analysis of plasmid diversity. PCR was used to amplify specific regions from pQBR plasmid DNA using pQBR103-designed primers (primer details are available on request). PCR conditions were as described for the estimation of plasmid enrichment, and the results were visualized by gel electrophoresis. The PCR products for CDSs 061, 391, 431, 435 and Int0023 for each pQBR plasmid were recovered from agarose after electrophoresis using the QIAquick gel extraction kit (Qiagen, UK) and sequenced using ABI BigDye V3.1 (Applied Biosystems, Europe) technology.
ISME-J 0070OAR
Microarray probe production. 144 pUC19 clone-probes were chosen from the sequencing libraries to maximize the coverage of the entire pQBR103 plasmid genome. Of these, only 122 were found to amplify an appropriately sized insert using pUC19-specific primers (and 10 random PCR fragments were sequenced to confirm that the inserts were of pQBR103 DNA, data not shown). Subsequently, the 122 probe regions were found to be distributed throughout pQBR103 and to cover 69% of the genome (details of the probe regions are available on request). Probes of 1415-3388 bp were obtained by PCR amplification of the pUC19 clones using M13f (5’-TGTAAAACGACGGCCAGT-3’) and pUCR (5’-GCGGATAACAATTTCACACAGGA-3’). 50 µl PCR reactions were performed using BioLine reagents with 1 µl template DNA, 0.2 pmol of each primer, 2.0 mM dNTP, 1.5 mM MgCl2, 0.045 U/µl Taq polymerase and 0.005 U/µl Pfu (Promega, UK), with cycling conditions of [3 min at 94°C], 30 x [1 min at 94°C, 1 min at 54°C, 7 min at 72°C], then [10 min at 72°C]. 5 µl aliquots of the PCR reactions were checked by gel electrophoresis. Successful amplifications were pooled for each clone before purification using the GenElute PCR clean-up Kit (Sigma, UK). Probe concentration was estimated by gel electrophoresis against DNA of known concentration and adjusted to 100 ng/µl. Each probe was suspended 50:50 in Genetix (Genetix, UK) spotting solution for amine slides and printed onto aminosilane slides using a Genetix Qarray mini microarray printer with solid tungsten 150 µm aQu pins. The slide was kept humidified for 12 hours, then baked for 30 minutes at 85°C and finally irradiated with 300 J UV.
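A genome coverage figure such as the 69% quoted for the probe set can be obtained by merging overlapping probe intervals before summing their lengths. The sketch below shows that calculation under simple assumptions: the function name and the example coordinates are hypothetical, intervals are half-open (start, end) pairs, and a probe spanning the origin of the circular plasmid would need to be split into two intervals first.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of the genome covered by the union of half-open (start, end) intervals."""
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Close the previous merged block before starting a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or abutting probe: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


# Hypothetical probe coordinates on a 1000 bp toy genome.
example_probes = [(0, 100), (50, 150), (300, 400)]
print(covered_fraction(example_probes, 1000))
```

Merging first matters because probes chosen from random sequencing libraries overlap, so summing raw probe lengths would overstate the coverage.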
Mapping and analysis of IVET sequences. In vivo expression technology (IVET) had been used previously to determine regions of transcriptional activity in pQBR103 specifically induced on sugar beet seedlings (Zhang et al. 2004a). Unique IVET sequences were mapped unambiguously to pQBR103, and these insertions were used to infer likely patterns of CDS expression by CDS cluster analysis. A cluster was arbitrarily defined as a series of CDSs in the same orientation as the IVET dap gene, in which CDSs upstream and downstream of the IVET insertion point were separated by ≤ 200 bp or ≤ 500 bp (i.e. a run of CDSs likely to be co-expressed by the same RNA transcript reported by the IVET fusion). IVET regions were defined by adjacent IVETs separated by ≤ 5 kb (i.e. a gap unlikely to be spanned by the RNA transcript reported by the IVET fusion).
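The grouping of IVET insertions into regions can be illustrated with a short sketch. The insertion coordinates are those listed in Supplementary Table 4; the function name and the decision to ignore the circular topology of the plasmid are assumptions made for this example.

```python
def group_into_regions(positions, max_gap=5000):
    """Group insertion coordinates so that neighbours within a region are <= max_gap bp apart."""
    regions = []
    for pos in sorted(positions):
        if regions and pos - regions[-1][-1] <= max_gap:
            regions[-1].append(pos)  # close enough to the previous insertion
        else:
            regions.append([pos])    # gap too large: start a new region
    return regions


# IVET insertion positions from Supplementary Table 4.
ivet_positions = [
    16565, 57469, 58992, 62487, 72580, 80188, 161288, 171715, 180322,
    225860, 242651, 245345, 246481, 247293, 257955, 259778, 269276,
    271081, 278449, 296034, 305792, 309926, 312662, 313690, 322194,
    374762, 393119, 394575,
]
regions = group_into_regions(ivet_positions)
print(len(regions))
```

Applied to these coordinates with the 5 kb threshold, the grouping yields 17 regions, matching the region numbering in Supplementary Table 4.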
Plasmid and bacterial chromosome genomes used for comparison. The following sequences were used (accession numbers to the Protein tables are provided; references therein): (Largest plasmids) pGMI1000MP (NC_003296), pSymA and pSymB (NC_003037), mega plasmid (NC_008043), p42c, p42d, p42e and p42f (NC_007766), pAT (NC_003306), pNGR234a (NC_000914), pHG1 (NC_005241), pMLa (NC_002679), plasmid 1 (NC_008242), R478 (NC_005211), and pREL1 (NC_007491); (Pseudomonas plasmids) pCAR1 (NC_004444), p1448A-A and p1448A-B (NC_007274), pWW0 (NC_003350), pND6-1 (NC_005244), pADP-1 (NC_004956), pDTG1 (NC_004999), NAH7 (NC_007926), pDC3000A and pDC3000B (NC_004632), pPSR1 (NC_005205), Rms149 (NC_007100), pPMA4326A and pPMA4326B (NC_005918), pFKN (NC_002759) and pRA2 (NC_005909); (Pseudomonas bacterial chromosomes) P. aeruginosa PAO1 (NC_002516), C3719 (NZ_AAKV00000000), UCBPP-PA14 (NZ_AABQ00000000), 2192 (NZ_AAKW00000000); P. entomophila L48 (NC_008027); P. fluorescens Pf-5 (NC_004129), Pf0-1 (NC_007492), SBW25; P. putida KT2440 (NC_002947), F1 (NZ_AALM00000000); P. syringae 1448A (NC_005773), B728a (NC_007005), and DC3000 (NC_004578). Data were obtained directly from the NCBI Protein tables, except for P. fluorescens SBW25, for which they were determined using Artemis and a preliminary annotation from the Wellcome Trust Sanger Institute. The presence of complete or partial insertion sequences and transposons (IS/Tn) in sample genomes was assessed by counting the number of CDSs annotated using the word ‘transposase’ (note that one to three transposase subunits are required to form an active transposase).
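The keyword count used to assess IS/Tn content can be sketched as below. The tab-delimited layout with a "Product" column is an assumed, simplified stand-in for a protein table export and is not the exact NCBI format; the function name is likewise hypothetical. The example rows reuse annotations from Supplementary Table 1.

```python
import csv
import io


def count_transposase_cds(table_text, column="Product"):
    """Count rows whose annotation contains the word 'transposase' (case-insensitive)."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    return sum(1 for row in reader if "transposase" in (row[column] or "").lower())


# Assumed minimal table layout: one CDS per row, annotation in the 'Product' column.
example_table = "\n".join([
    "CDS\tProduct",
    "426\tTn5042-like transposase, TnpA",
    "427\tTn5042-like transposase, TnpB",
    "428\tTn5042-like transposase, TnpC",
    "431\tTn5042-like mercuric ion reductase, MerA",
])
transposase_count = count_transposase_cds(example_table)
print(transposase_count)
```

As the text notes, a count of this kind reflects annotated subunits rather than complete elements, since one to three subunits may make up a single active transposase.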
Supplementary Figure Legends
Supp. Figure 1.
The closest homologues of pQBR103 CDSs are from Pseudomonas spp. Of the 478 CDSs in pQBR103, 95 have significant levels of homology to sequences found in public databases. The likely origin of these 95 sequences can be inferred from the taxonomy of the closest homologue; the main plot of homology (Expected value) versus rank demonstrates that the best homologues are from Pseudomonas spp. (red circles). The inset shows the Expected value for each CDS with significant homologies in order across the pQBR103 sequence; high homologies are spread throughout the plasmid. Colour coding: Pseudomonas spp., Red; Other -Proteobacteria excluding Pseudomonas spp., Blue; Others excluding all Proteobacteria, Green; Poor homologies from a variety of taxonomies, Black.
Closest homologues for each CDS are listed in Supplementary Table 1.
Supplementary Tables
Supplementary Table 1.
Annotation of functional CDS in the pQBR103 plasmid genome.
Columns: CDS; Annotation*; Closest homology to Pseudomonas spp.† [Species (Plasmid); E-value]; Closest homology to all taxa‡ [Taxa, Species (Plasmid); E-value; % ID; Accession]
1
Plasmid partitioning protein, ParA
P. putida (pWW0)
1.0 e-23
Alphaproteobacteria;
Rhizobium meliloti (pMBA19a)
4.0 e-33
34.78
Q5BTN9_RHIME
2
Plasmid partitioning protein, ParB
P. aeruginosa
1.0 e-05
Gammaproteobacteria;
Xylella fastidiosa
2.0 e-13
26.81
Q9PH83_XYLFA
28
Type II/III secretion system pilus
protein, PilN-like homologue
P. syringae B728a
1.0 e-15
Deltaproteobacteria;
Desulfotalea psychrophila
3.0 e-21
24.31
Q6AS33_DESPS
32
Pilus assembly-related protein,
PilB/TapB-like homologue
P. aeruginosa
1.0 e-57
Firmicutes;
6.0 e-66
Thermoanaerobacter tengcongensis
40.25
Q8RAG1_THETN
33
Type IV pilus biogenesis-like protein
No significant hit
Low homology
6.60
31.25
Q1QKR2_NITHA
34
Type II/IV transmembrane
secretion-related protein
No significant hit
Chlamydiae;
Parachlamydia sp. UWE25
5.0 e-05
22.49
Q6M9X9_PARUW
35
Transmembrane pilus-related
protein, PilA/ComP-like homologue
No significant hit
Low homology
2.20
28.28
Q3SEB6_PARTE
36
General secretion pathway
protein
P. fluorescens SBW25 5.0 e-05
Betaproteobacteria;
Burkholderia ambifaria AMMD
3.0 e-7
38.96
Q3FDY7_9BURK
37
Arylsulfatase-activating
protein-like homologue
No significant hit
Euryarchaeota;
Methanosaeta thermophila PT
3.0 e-20
24.93
Q2CLU4_9EURY
38
Arylsulfatase-activating
protein-like homologue
P. fluorescens Pf0-1
Euryarchaeota;
Methanosaeta thermophila PT
3.0 e-16
26.29
Q2CLU4_9EURY
51
Arylsulfatase-activating
protein-like homologue
No significant hit
Gammaproteobacteria;
Vibrio parahaemolyticus
7.0 e-6
24.52
Q87IA8_VIBPA
52
Twitching motility protein, PilT-like
homologue
P. putida KT2440
4.0 e-19
Deinococcus-Thermus;
Thermus thermophilus HB8
2.0 e-22
34.82
Q5SHF6_THET8
53
GGDEF-family domain protein
P. putida F1
5.0 e-17
Alphaproteobacteria;
1.0 e-22
Agrobacterium tumefaciens C58
29.96
Q8UB10_AGRT5
55
Plant-inducible DNA helicase, HelA
(EC 3.6.1.-)
P. fluorescens Pf-5
2.0 e-13
Deltaproteobacteria;
2.0 e-145 33.19
Pelobacter carbinolicus DSM 2380
Q3A3V6_PELCD
56
Catabolite gene activator family
protein, Crp/Vfr-like homologue
No significant hit
Gammaproteobacteria;
Photorhabdus luminescens
2.0 e-4
23.76
Q7MB98_PHOLL
57
Ribosomal protein,
RpsJ/NusE/S10-like homologue
No significant hit
Actinobacteria;
6.0 e-5
Rubrobacter xylanophilus DSM 9941
32.65
Q3X1J8_9ACTN
65
Deoxyribonuclease
P. putida KT2440
41.45
Q3JF75_NITOC
Closest homology to
all taxa‡
9.0 e-4
1.0 e-12
Gammaproteobacteria;
2.0 e-44
Nitrosococcus oceani ATCC 19707
5.0 e-26
Gammaproteobacteria;
1.0 e-26
Marine proteobacterium HTCC2207
73
Catabolite gene activator family
protein, Crp/Vfr-like homologue
P. putida KT2440
74
Plant-inducible DNA helicase, HelC
Top hit
Gammaproteobacteria;
P. syringae 1448A
5.0 e-117 33.92
Q48IC3_PSE14
76
Restriction-modification methylase
No significant hit
Betaproteobacteria;
Ralstonia eutropha (pHG1)
5.0 e-104 48.27
Q7WX17_RALEU
80
Transmembrane rhomboid
family protein
Top hit
Gammaproteobacteria;
P. syringae 1448A
2.0 e-68
65.93
Q48FU5_PSE14
84
Glutathionylspermidine synthase
(EC 6.3.1.9)
Top hit
Gammaproteobacteria;
P. syringae 1448A
2.0 e-178 74.29
Q48FU9_PSE14
97
DNA-binding domain protein
P. putida (pWW0)
4.0 e-10
Gammaproteobacteria;
2.0 e-19
Shewanella denitrificans OS217
28.17
Q3NZV3_9GAMM
103 Response regulator domain protein
P. fluorescens Pf-5
5.0 e-08
Bacteroidetes;
Salinibacter ruber DSM 13855
4.0 e-11
38.39
Q2RZC9_SALRD
104 Type IV leader peptide processing
enzyme (EC 2.1.1.-, 3.4.23.43)
Top hit
Gammaproteobacteria;
P. aeruginosa
6.0 e-75
48.96
LEP4_PSEAE
105 Site-specific recombinase,
Integrase family, Int
Top hit
Gammaproteobacteria;
P. syringae DC3000
1.0 e-85
42.82
Q881N3_PSESM
110 Site-specific recombinase,
Integrase family, Int
Pseudomonas sp.
ND6 (pND6-1)
6.0 e-16
Gammaproteobacteria;
3.0 e-66 45.19
Xanthomonas campestris 85-10 (pXCV183)
Q3C033_XANCS
113 Plasmid partitioning protein, ParB
P. putida F1
6.0 e-10
Actinobacteria;
7.0 e-11
Symbiobacterium thermophilum
43.06
Q67J37_SYMTH
119 Transmembrane thiol:disulfide
interchange protein, DsbD-like
(EC 1.8.1.8)
P. syringae DC3000
2.0 e-75
Gammaproteobacteria;
Azotobacter vinelandii AvOP
1.0 e-81
36.93
Q4IWJ7_AZOVI
123 Transmembrane protein
Top hit
Gammaproteobacteria;
P. resinovorans (pCAR1)
8.0 e-29
40.23
Q8GHV0_PSERE
124 Zn-dependent protease with
chaperone function
Top hit
Gammaproteobacteria;
P. aeruginosa
4.0 e-63
54.65
Q9HVF9_PSEAE
126 Transmembrane protein, TolA-like
homologue
Top hit
Gammaproteobacteria;
P. aeruginosa
1.0 e-13
29.65
TOLA_PSEAE
128 DNA-binding domain protein
P. syringae B728a
Betaproteobacteria;
Polaromonas sp. JS666
2.0 e-36
46.53
Q4ASP6_9BURK
131 Transmembrane thiol:disulfide
interchange protein, DsbD-like
Top hit
Gammaproteobacteria;
P. syringae DC3000
6.0 e-8
27.61
Q87VS7_PSESM
134 Transmembrane autotransporter
Top hit
Gammaproteobacteria;
P. syringae B728a
6.0 e-129 40.65
Q4ZYT2_PSEU2
149 NAD-dependent deacetylase
(EC 3.5.1.-)
Top hit
Gammaproteobacteria;
P. fluorescens Pf-5
1.0 e-71
57.02
Q4KDX3_PSEF5
151 Nucleoid-associated protein,
NdpA-like homologue
Top hit
Gammaproteobacteria;
P. fluorescens Pf-5
1.0 e-154 81.08
Q4KHU2_PSEF5
155 Pilin-related protein, PilV-like
homologue
P. aeruginosa
(pKLC102)
Gammaproteobacteria;
Serratia entomophila
1.0 e-6
Q7BQX0_9ENTR
31.25
Q1YVH0_9GAMM
2.0 e-4
5.0 e-05
33.33
157 UV resistance protein, RulA
Top hit
Gammaproteobacteria;
P. putida (pNAH7)
7.0 e-34
158 UV resistance protein, RulB
Top hit
Gammaproteobacteria;
P. putida (pNAH7)
1.0 e-149 59.86
Q1XGP6_PSEPU
160 Conjugal transfer protein,
TrbN-like homologue
No significant hit
Low homology
3.30 e-1
33.33
Q47073_ECOLI
171 Ribonuclease HII
(EC 3.1.26.4)
Top hit
Gammaproteobacteria;
P. fluorescens Pf0-1
2.0 e-75
72.87
RNH2_PSEPF
175 Transcriptional regulator DnaK
suppressor protein, TraR/DksA-like
homologue
Top hit
Gammaproteobacteria;
P. syringae DC3000
2.0 e-19
44.09
Q87ZE2_PSESM
178 DNA-binding protein, Hu
P. aeruginosa
Gammaproteobacteria;
Methylococcus capsulatus
2.0 e-25
61.8
Q60BE5_METCA
181 Exodeoxyribonuclease,
RecD/TraA-like homologue
No significant hit
Alphaproteobacteria;
Acidiphilium cryptum JF-5
1.0 e-18
20.94
Q2DE92_ACICY
182 Conjugal transfer TraG-family
coupling protein
P. syringae DC3000
(pDC3000A)
Betaproteobacteria;
Burkholderia vietnamiensis G4
3.0 e-33
25.68
Q4BGZ3_BURVI
188 Conjugal transfer TraB-family
topoisomerase
No significant hit
Low homology
4.30 e-2
21.84
Q54WI5_DICDI
189 Conjugal transfer assembly protein,
TraV-like homologue
No significant hit
Gammaproteobacteria;
6.0 e-5
Legionella pneumophila Philadelphia 1
33.64
Q5ZTS3_LEGPH
209 DNA primase, DnaG-like homologue
No significant hit
Spirochaetes;
Borrelia burgdorferi
8.0 e-8
23.43
PRIM_BORBU
213 GGDEF two-component response
regulator
Top hit
Gammaproteobacteria;
P. aeruginosa
3.0 e-58
38.06
Q9HUW7_PSEAE
249 Restriction enzyme-related protein
No significant hit
Deltaproteobacteria;
2.0 e-13
Anaeromyxobacter dehalogenans 2CP-C
51.32
Q2IGC5_ANADE
255 Bacteriophage-related protein
of unknown function
P. fluorescens SBW25 1.0 e-20
dsDNA virus;
Pseudomonas phage F116
2.0 e-29
37.87
Q5QF30_9CAUD
289 Plasmid conjugal transfer inhibition
protein, Tir-like
No significant hit
Gammaproteobacteria;
Erwinia amylovora (pEL60)
1.0 e-12
32.81
Q6TFZ5_ERWAM
301 Ankyrin repeat-containing protein
No significant hit
Eukaryota;
Mus musculus
2.0 e-12
34.52
Q8C8R3_MOUSE
Gammaproteobacteria;
P. syringae 1448A
1.0 e-51
45.33
Q48GQ0_PSE14
Gammaproteobacteria;
Halorhodospira halophila SL1
5.0 e-21
32.64
Q2CS52_ECTHA
318 Plant-inducible oligoribonuclease, Orn No significant hit
(EC 3.1.-.-)
Eukaryota;
2.0 e-29
Tetrahymena thermophila SB210
42.94
Q22ZB0_TETTH
327 Bacteriophage-related protein
of unknown function
No significant hit
Low homology
8.80 e-2
32.26
Q853S7_9CAUD
344 Transcriptional regulator, AlgZ-like
homologue
P. fluorescens Pf-5
Gammaproteobacteria;
Azotobacter vinelandii AvOP
5.0 e-19
51.55
Q4J155_AZOVI
53.9
Q1XGP5_PSEPU
2.0 e-25
7.0 e-12
307 Acetyltransferase GNAT family protein Top hit
311 Stringent starvation protein,
SspA-like homologue
P. syringae 1448A
1.0 e-42
1.0 e-18
350 RNA polymerase sigma-32 factor,
RpoH-like homologue
Top hit
Gammaproteobacteria;
P. aeruginosa
2.0 e-84
62.45
RP32_PSEAE
361 Plant-inducible DNA helicase, HelB
Top hit
Gammaproteobacteria;
P. resinovorans (pCAR1)
0.0
54.86
Q8GHN5_PSERE
364 Cold shock DNA-binding domain
protein, Csp-like homologue
Top hit
Gammaproteobacteria;
P. putida KT2440
7.0 e-41
39.18
Q88Q61_PSEPK
367 DnaJ family protein
P. fluorescens SBW25 8.0 e-08
Gammaproteobacteria;
5.0 e-15
Nitrosococcus oceani ATCC 19707
54.29
Q3JC07_NITOC
371 DNA helicase
P. aeruginosa
Betaproteobacteria;
Burkholderia vietnamiensis G4
2.0 e-56
32.85
Q4BMY9_BURVI
375 Bacteriophage-related protein
of unknown function
No significant hit
dsDNA viruses;
Vibrio phage K139
2.0 e-12
36.48
Q8W758_9CAUD
377 Transcriptional regulator, AlgZ-like
homologue
Top hit
Gammaproteobacteria;
P. aeruginosa
5.0 e-04
46
Q9RPY7_PSEAE
383 Plasmid IncA/C–Inc P3 replication
protein, RepA
No significant hit
Gammaproteobacteria;
Escherichia coli (pRA1)
1.0 e-61
44.56
Q08896_ECOLI
401 DNA helicase (EC 3.6.1.-)
Top hit
Gammaproteobacteria;
P. syringae B728a
7.0 e-130 48.28
403 DNA restriction methylase
Top hit
Gammaproteobacteria;
1.0 e-72
Pseudomonas sp. ND6 (pND6-1)
35.87
Q6XUK5_9PSED
407 DNA ligase, bacteriophage-like
homologue
No significant hit
dsDNA viruses;
Bacteriophage KVP40
6.0 e-28
27.52
Q6WI94_BPKV4
413 Single-strand binding protein,
Ssb-like homologue
Top hit
Gammaproteobacteria;
P. syringae DC3000
3.0 e-29
38.74
SSB_PSESM
421 Response regulator receiver
domain protein
Top hit
Gammaproteobacteria;
P. aeruginosa
1.0 e-34
53.23
PILG_PSEAE
426 Tn5042-like transposase, TnpA
Top hit
2.0 e-61
99.15
Q70MR7_PSEFL
427 Tn5042-like transposase, TnpB
Top hit
Gammaproteobacteria;
P. fluorescens
Gammaproteobacteria;
P. fluorescens
1.0 e-61
100
Q70MR8_PSEFL
428 Tn5042-like transposase, TnpC
Top hit
Gammaproteobacteria;
P. fluorescens
0.0
99.4
Q70MR9_PSEFL
430 Tn5042-like organomercurial lyase,
MerB (EC 4.99.1.2)
Top hit
Gammaproteobacteria;
P. fluorescens
2.0 e-115 99.06
Q70MS1_PSEFL
431 Tn5042-like mercuric ion reductase,
MerA (EC 1.16.1.1)
Top hit
Gammaproteobacteria;
P. fluorescens
0.0
96.25
Q70MS2_PSEFL
432 Tn5042-like inner membrane mercury
ion uptake protein, MerC
Top hit
Gammaproteobacteria;
P. fluorescens
1.0 e-55
93.75
Q53IQ9_PSEFL
433 Tn5042-like periplasmic mercuric ion
binding protein, MerP
Top hit
Gammaproteobacteria;
P. fluorescens
9.0 e-44
100
Q53IR0_PSEFL
434 Tn5042-like mercuric ion transport
protein, MerT
Top hit
Gammaproteobacteria;
P. fluorescens
3.0 e-43
99.14
Q53IQ8_PSEFL
435 Tn5042-like Mer operon
activator/repressor, MerR
Top hit
Gammaproteobacteria;
P. fluorescens
1.0 e-76
99.3
Q70MS3_PSEFL
7.0 e-11
5.0 e-04
Q4ZWH0_PSEU2
438 Recombination-associated protein,
RdgC-like homologue
Top hit
Gammaproteobacteria;
P. fluorescens Pf-5
4.0 e-93
61.76
Q4K8D9_PSEF5
443 Carbon storage translational
RsmA/CsrA family regulator,
RsmA-like homologue
Top hit
Gammaproteobacteria;
P. fluorescens Pf-5
4.0 e-14
73.08
Q4KEY0_PSEF5
445 Plasmid IncA/C–Inc P3 replication
protein, RepA
No significant hit
Gammaproteobacteria;
Buchnera aphidicola (pBPs2)
1.0 e-33
29.61
Q9ZER8_9ENTR
455 Plasmid partitioning protein, ParB
P. syringae 1448A
1.0 e-18
Betaproteobacteria;
Ralstonia metallidurans CH34
1.0 e-61
30.88
Q5NUW9_RALME
461 Exodeoxyribonuclease I
P. putida KT2440
1.0 e-10
Gammaproteobacteria;
Methylococcus capsulatus
4.0 e-15
24.77
Q605A2_METCA
464 DNA polymerase III subunit
(EC 2.7.7.7)
P. syringae 1448A
4.0 e-31
Gammaproteobacteria;
Vibrio fischeri ATCC 700601
1.0 e-31
28.53
Q5E8Z1_VIBF1
465 RNA polymerase sigma factor,
RpoD-like homologue
P. putida F1
5.0 e-17
Betaproteobacteria;
Ralstonia solanacearum
2.0 e-20
23.95
Q8XXA1_RALSO
470 Sensory transduction histidine kinase
P. aeruginosa
3.0 e-13
Gammaproteobacteria;
2.0 e-19
Nitrosococcus oceani ATCC 19707
31.42
Q3JET8_NITOC
471 Two-component regulator sensor
histidine kinase fused response
regulator protein, PilL-like homologue
(EC 2.7.3.-)
P. aeruginosa
9.0 e-73
Gammaproteobacteria;
5.0 e-78
Alkalilimnicola ehrlichei MLHE-1
35.27
Q34VJ7_9GAMM
472 Chemotaxis signalling protein,
PilK-like homologue
Top hit
2.0 e-24
31.58
Q51346_PSEAE
473 Methyl-accepting chemotaxis
P. fluorescens SBW25 2.0 e-22
transducer protein, PilJ-like homologue
Gammaproteobacteria;
8.0 e-25
Alkalilimnicola ehrlichei MLHE-1
24.06
Q34VJ8_9GAMM
474 Pilus-related protein, PilI-like
homologue
No significant hit
Gammaproteobacteria;
Xylella fastidiosa
2.0 e-6
27.66
Q9PC31_XYLFA
475 Two-component response regulator
transcriptional regulatory protein,
PilG-like homologue
P. aeruginosa
3.0 e-17
Gammaproteobacteria;
Psychrobacter arcticum
1.0 e-19
42.98
Q4FQP2_PSYAR
477 Plasmid partitioning protein, ParB
P. putida F1
9.0 e-29
Gammaproteobacteria;
Legionella pneumophila Lens
2.0 e-36
35.23
Q5WTL1_LEGPL
Gammaproteobacteria;
P. aeruginosa
* Annotation based on inspection of homology data over the entire length of the CDS using predicted protein sequences. Assignments were made only if homologous sequences had functional annotations, except for a limited number of CDSs annotated on the basis of domain-only homology or as bacteriophage-related proteins of unknown function. Enzyme Commission (EC) numbers were assigned by the GNARE system (Sulakhe et al, 2005).
† The closest BLAST homology to a Pseudomonas spp. chromosomal sequence or a Pseudomonas plasmid sequence, unless the homology was the highest observed (Top hit), in which case the data are provided in the adjacent ‘Closest homology to all taxa’ columns. ‘No significant hit’ indicates no significant homology match to any Pseudomonas sequence.
‡ The closest BLAST homology observed from any taxa including Pseudomonas. ‘Low homology’ indicates very weak levels of homology over a reasonable length of the CDSs or supporting genomic context. % ID is with no gaps.
Supplementary Table 2.
Clustering of the predicted proteins in pQBR103 into gene families.
Cluster | Comment | CDS | Mixed
1 | ParB partition | 113, 455, 477 |
2 | Pilin-associated | 36, 155 |
3 | Arylsulfatase regulator-like | 37, 38, 39, 42, 47, 50, 51* | F,F,C,C,C,C,F
4 | Twitching motility protein | 32, 52 |
5 | DNA-binding domain protein | 97, 221, 223, 337 | F,C,C,C
6 | Response regulator | 103, 213, 421, 471, 475 |
7 | Transmembrane thiol:disulfide interchange protein | 119, 131 |
8 | Transmembrane | 126, 129 | C,F
9 | Transcriptional regulator | 344, 347 | F,O
10 | RNA polymerase sigma factor | 350, 465 |
11 | Plasmid replication protein | 383, 445 |
12 | Conserved hypothetical | 43, 44, 46* |
13 | Conserved hypothetical | 391, 416 |
14 | Conserved hypothetical/Orphan | 240, 375 | C,O
15 | Conserved hypothetical/Orphan | 327, 328 | C,O
16 | Orphan | 125, 156 |
17 | Orphan | 216, 244, 247, 265 |
18 | Orphan | 217, 218 |
19 | Orphan | 233, 238 |
CDS regions were placed into protein families using the mcl clustering method at different stringencies (Enright et al, 2002). Zero clusters formed at an inflation level of 3.0 or above. Six and nineteen clusters formed, respectively, at the less conservative thresholds of 2.0 (underlined) and 1.2. Possible functional characteristics for each gene family are based on the annotation of functional CDSs within each cluster. If a cluster contained a combination of CDSs it is labelled ‘mixed’ (F, functional, CH, conserved hypothetical, O, orphan). Contiguous CDSs are marked in bold. * Clusters 3 and 13 are transitively linked at very low homology.
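Labelling a cluster as ‘mixed’ amounts to checking whether its members fall into more than one annotation category. The sketch below assumes cluster output with one cluster per line and tab-separated members (the layout mcl produces when run on labelled input); the function names are hypothetical, and the category assignments shown follow Supplementary Table 2.

```python
def parse_clusters(text):
    """Parse cluster output: one cluster per line, members separated by tabs."""
    return [line.split("\t") for line in text.splitlines() if line.strip()]


def is_mixed(cluster, category):
    """A cluster is 'mixed' if its members do not all share one annotation category."""
    return len({category[member] for member in cluster}) > 1


# Two clusters from Supplementary Table 2 (clusters 9 and 10), written in the assumed layout.
cluster_text = "344\t347\n350\t465\n"
# F = functional, O = orphan, as used in the table.
category = {"344": "F", "347": "O", "350": "F", "465": "F"}

clusters = parse_clusters(cluster_text)
mixed_flags = [is_mixed(cluster, category) for cluster in clusters]
print(mixed_flags)
```

Lower inflation values give coarser clusterings, which is why more clusters (and more mixed ones) appear at 1.2 than at 2.0.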
Supplementary Table 3.
Potential VirB/D4 T4SS-like transfer elements
CDS | Annotation | Closest VirB/D4 homologue: Vir* | %ID† | E-value | Accession
160 | Conjugal transfer TrbN-like homologue | VirB1 | 25/43 | 6 x E-3 | YP_190354.1
175 | Transcriptional regulator, TraR/DksA-like homologue | - | 44/62 | 2 x E-21 | AAO56962.1
181 | Exodeoxyribonuclease/helicase, RecD/TraA-like homologue | VirB6 | 21/37 | 5 x E-16 | ZP_01144634.1
182 | Conjugal transfer TraG-family coupling protein | VirD4 | 25/43 | 2 x E-33 | ZP_00242688.1
188 | Conjugal transfer TraB-family topoisomerase | VirB10 | 26/40 | 1 x E-05 | AAW83066.1
189 | Conjugal transfer TraV-like homologue | VirB7 | 33/43 | 9 x E-05 | AAU28154.1
191 | Conjugal transfer TraC-like homologue | VirB4 | 27/45 | 4 x E-32 | AAW83069.1
The Agrobacterium tumefaciens pTi (pTiC58) VirB/D4 system is used as the paradigm of Mating pair formation (Mpf) and related type IV secretion systems (Schroder & Lanka, 2005). It contains 12 genes, VirB1-11 and D4, which are conserved in a large number of secretion systems transporting DNA. These systems have 9-14 genes, and the order of these differs between transfer systems. Additional Mpf/type IV genes are Tra (IncFI F) and Trb (IncP RP4).
* The most likely VirB/D4 homologue is indicated for each CDS. No significant homologues to VirB3, B5, B6, B8, B9 or B11 in pQBR103 were identified. The CDSs in between the CDSs indicated had no homology to proteins involved in Mpf/Type IV secretion systems.
† % identity and similarity are given.
Supplementary Table 4.
Plant-inducible IVET fusions and potential transcriptional regions.
Region | IVET | Position & Transcription | Number of CDS in Cluster likely to be expressed by transcription reported by IVET fusion (CDS) (F/CH/O)
1 | CP-dS3 | 16,565 (-) | 0†
2 | G4-8S2 | 57,469 (-) | 0†
2 | G3-1S3 | 58,992 (+) | 2 (054-055) (1/1/0) HelA is CDS 055.
2 | p8-c2 (ppi4) | 62,487 (+) | 2 (054-055) (1/1/0)
3 | BB-8S1 | 72,580 (-) | 0†
4 | R6 (ppi17) | 80,188 (+) | 1 (074) (1/0/0) HelC is CDS 074.
5 | BB-5S3 | 161,288 (+) | 0†
6 | G3-4S5 | 171,715 (+) | 1 (163) (0/0/1) [2 (163-164) (0/0/2)]
7 | BB-2S2 | 180,322 (-) | 2 (172-171) (1/0/1)
8 | G4-7R1 | 225,860 (-) | 0†
9 | G4-7S2 | 242,651 (-) | 0†
9 | L2-R3 | 245,345 (+) | 7 (243-249) (1/0/6) [11 (239-249) (1/1/9)]
9 | G5-6R4 | 246,481 (-) | 0†
9 | G4-10S1 | 247,293 (-) | 0†
10 | R4' | 257,955 (+) | 1 (265) (0/0/1)
10 | CP-dR1 | 259,778 (+) | 3 (268-270) (0/0/3)
11 | G2-3R1 | 269,276 (-) | 1 (283) (0/0/1)
11 | R5 | 271,081 (-) | 0†
12 | G4-2R2 | 278,449 (+) | 0†
13 | G4-6S1 (pIVETD5) | 296,034 (+) | 9 (315-323) (1/1/7) Orn is CDS 318.
14 | G3-2R6 | 305,792 (-) | 0†
14 | G2-5S1 | 309,926 (-) | 0†
14 | CP5-1S | 312,662 (-) | 0†
14 | G3-4R3 | 313,690 (+) | 6 (344-349) (1/1/4) [32 (332-363) (3/4/25)]
15 | CP-eR1 | 322,194 (+) | 13 (351-363) (1/1/11) [32 (332-363) (3/4/25)] HelB is CDS 361.
16 | CP4-1R | 374,762 (+) | 9 (421-429) (4/1/4) Adjacent to Tn5042. TnpA is CDS 426.
17 | R3' | 393,119 (-) | 0†
17 | G1-4R1 | 394,575 (-) | 0†
Estimate of CDS potentially expressed in planta (≤ 200 bp separation): 65 (14%); 11/7/47 (12/7/17%)
Estimate of CDS potentially expressed in planta (≤ 500 bp separation): 83 (17%); 12/10/61 (13/10/22%)
Number of IVET fusions apparently not reporting CDS transcription: 15 (52%)
Number of regions apparently not reporting CDS transcription: 6 (35%)
Number of regions in which transcription is reported in both orientations: 4 (24%)
All Dap-end IVET sequences were mapped onto the pQBR103 genome sequence. IVET names and synonyms are given. Of the 37 reported in Zhang et al (2004a), the above were found to be unique after Dap-end sequences were compared to the completed pQBR103 genome. The position of each IVET insertion point is given, plus the direction of transcription (+/- strand) reported by the IVET fusion. CDS clusters are defined in the Supporting text and are those likely to be transcribed as indicated by each IVET fusion (calculated assuming ≤ 200 bp between IVET insertion point and closest CDS, and with ≤ 200 bp between adjacent CDS in the same orientation [or assuming ≤ 500 bp]). The number of CDSs in each cluster is given, followed by the CDSs and F/CH/O numbers in parentheses (F, functional, CH, conserved hypothetical, O, orphan). † indicates that no CDSs are likely to be expressed in that orientation, and that a cluster of ≥ 1 CDS exists in the opposite orientation. The 11 F CDS identified with ≤ 200 bp separation are: CDS 055, 074, 171, 249, 318, 344, 361, 421, 426, 427 and 428; the additional CDS (≤ 500 bp) is CDS 350. IVET insertion positions have been grouped into regions where adjacent IVETs are separated by ≤ 5 kb. The transcription reported by each IVET may occur in a position containing CDSs or in a position without CDSs (> 200 or > 500 bp of a CDS). Further, IVET transcription may support the direction of transcription inferred by the orientation of CDSs or it may not. Finally, transcription reported by adjacent IVETs may be in agreement or suggest convergent/divergent/overlapping transcription.
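The cluster rule described in this legend can be sketched as a simple chaining procedure. This is one possible reading of the rule and not necessarily the procedure used to build the table: the `CDS` record, the function name and all coordinates are hypothetical, the genome is treated as linear, and a CDS on the opposite strand is assumed to break a run.

```python
from typing import NamedTuple


class CDS(NamedTuple):
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def ivet_cluster(cds_list, insertion, strand, max_gap=200):
    """Return names of the same-strand CDS run lying within max_gap of the insertion point."""
    runs, current = [], []
    for cds in sorted(cds_list, key=lambda c: c.start):
        same_run = (
            current
            and cds.strand == current[-1].strand
            and cds.start - current[-1].end <= max_gap
        )
        if same_run:
            current.append(cds)
        else:
            if current:
                runs.append(current)
            current = [cds]
    if current:
        runs.append(current)
    for run in runs:
        near = run[0].start - max_gap <= insertion <= run[-1].end + max_gap
        if run[0].strand == strand and near:
            return [c.name for c in run]
    return []  # corresponds to a '0' entry: no CDS reported in this orientation


# Hypothetical toy annotation, not pQBR103 coordinates.
toy = [
    CDS("a", 100, 400, "+"),
    CDS("b", 450, 900, "+"),
    CDS("c", 1500, 2000, "+"),
    CDS("d", 2100, 2500, "-"),
]
print(ivet_cluster(toy, 420, "+"), ivet_cluster(toy, 2300, "+"))
```

Re-running the same function with `max_gap=500` gives the more permissive estimate shown in square brackets in the table.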
Supplementary Table 5.
PCR surveys were used to demonstrate the presence or absence of pQBR103 regions in other pQBR plasmids.
pQBR plasmid (Group; Estimated Size, kb): 4 (I; 321), 41 (I; 301), 42 (I; 367), 44 (I; 130), 47 (I; 256), 29 (III; 174), 55 (III; 149), 57 (IV; 261)
CDS | Annotation | 4 | 41 | 42 | 44 | 47 | 29 | 55 | 57
027 | Orphan | + | + | + | - | + | - | - | -
028 | Type II/III secretion system pilus protein | + | + | + | - | + | - | - | -
029 | Orphan | + | + | + | - | + | - | - | -
030 | Orphan | + | + | + | - | + | - | - | -
031 | Orphan | + | + | + | - | + | - | - | -
032 | Pilus assembly-related protein | + | + | + | - | + | - | - | -
033 | Pilus biogenesis-like protein | + | + | + | - | + | - | - | -
034 | Transmembrane secretion-related protein | + | + | + | - | + | - | - | -
035 | Transmembrane pilus-related protein | + | + | + | - | + | - | - | -
036 | Transmembrane secretion-related protein | + | + | + | - | + | - | - | -
IV0023 | Between CDS053-54, overlapping 54 | + | + | + | - | + | - | - |
055 | Plant-inducible DNA helicase, HelA | + | + | + | - | + | - | - |
062 | Conserved hypothetical | + | + | + | - | + | - | - |
144 | Orphan | + | + | + | - | - | - | - |
151 | Nucleoid-associated protein | + | + | + | - | - | - | - |
156 | Orphan | + | + | + | - | - | - | - |
NI0730 | Between CDS215-216, overlapping 216 | + | + | + | - | - | - | - |
249 | Restriction enzyme-related protein | + | + | + | - | - | - | - |
IV0036 | Between CDS282-283, overlapping 283 | + | + | + | - | - | - | - |
286 | Conserved hypothetical | + | + | + | - | - | - | - |
297 | Orphan | + | + | + | - | - | - | - |
310 | Orphan | + | + | + | - | - | - | - |
318 | Plant-inducible oligoribonuclease, Orn | + | + | + | - | + | - | - |
361 | Plant-inducible DNA helicase, HelB | + | + | + | + | + | - | - |
371 | Helicase | + | + | + | + | + | - | - |
391 | Conserved hypothetical | + | + | + | + | + | - | - |
431 | Tn5042-like Mercuric ion reductase, MerA | + | + | + | + | + | + | + | +
435 | Tn5042-like Mer activator/repressor, MerR | + | + | + | + | + | + | + | +
IV0034 | Between CDS453-454, overlapping 454 | + | + | + | + | + | - | - |
470 | Transcriptional regulator methylesterase | + | + | + | + | + | - | - | -
471 | Two-component regulator sensor histidine kinase fused response regulator protein | + | + | + | + | + | - | - | -
472 | Chemotaxis signalling protein, CheR | + | + | + | + | + | - | - | -
473 | Methyl-accepting chemotaxis transducer protein | + | + | + | + | + | - | - | -
474 | Pilus-related protein | + | + | + | + | + | - | - | -
475 | Two-component response regulator transcriptional regulatory protein | + | + | + | + | + | - | - | -
Note that pQBR57 was only surveyed for these regions plus CDS 431 and CDS 435. IV/NI regions are intergenic regions rather than specific CDS. All primer pairs amplified pQBR103 DNA but not P. fluorescens SBW25 or P. putida UWC1 DNA.
PCR survey results: +, successful amplification with appropriately-sized fragment; -, amplification not detected.