Table S4: Instances of Glucose

Table S4 – Version 1: Locations of glucose motif #2.
Genes Diverging from      Motif Sequence b      Bases from
Intergenic Region                               Start Codon
CC0084                    GAAACTGTCATCCCGGCCAA  155
CC0138/CC0139 e           GCAATTGCAAGTCAGTCGCA  142/99
CC0214 c, e               GAGACTGTTATGAAATAACG  62
CC0669 c                  AAAAATTTCATGCGTGCGCC  32
CC0820 f                  AAAACTGTCATCCCGGCCCA  118
CC1493 c                  AAAAATACGATGGTAGCGCA  170
CC1517 c, e/CC1518 c      GCAATTGTTATGATATAACA  93/74
CC2193 a, c               CTAATTAAGATTGATTCTCA  71
CC2226 a, c               GAAACTGTCATGCCAGACCG  152
CC2367 d                  CTAATTGCGAGTCAATCGCA  127
CC2928 c                  AATATTGCGAGTGCGTCGCA  31
CC3059 c                  GAAATTGCAATTCGTTCTCA  65
CC3062 c                  AAAACTGTCATCCCGGCCCA  102
CC3163 a, c               AAAAATATGAGCGAAGAAAC  46
Multilevel Consensus      AAAAATGTGATGCAATCACA
                          G
                          T ACC GTGCTGAC C
                          C
                          A
                          G
a
Two genes diverge from the intergenic region; only the gene shown was up in M2G, but did not
necessarily meet the significance criteria.
b
The 475 bases upstream of each gene and the first 125 bases of each gene (to provide robustness against
misplaced start codons) were searched. Four instances of the motif were omitted because the microarray
data did not support their inclusion and the motif was either more than 300 base pairs from the start
codon or overlapped a coding region.
c
Gene was up in M2G and met significance criteria.
d
Gene was up in M2G but did not meet significance criteria.
e
Gene is significantly up in M2.
f
Gene is significantly up in M2X.
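As an illustration of how the listed instances relate to one another, the following minimal Python sketch tallies the bases observed at each of the 20 motif positions across the 14 instances in Version 1. It is not the procedure used to produce the multilevel consensus reported in the table; the helper names position_counts and invariant_positions are hypothetical and introduced here only for illustration.

```python
from collections import Counter

# Motif instances copied from Table S4, Version 1 (one per intergenic region).
INSTANCES = [
    "GAAACTGTCATCCCGGCCAA",
    "GCAATTGCAAGTCAGTCGCA",
    "GAGACTGTTATGAAATAACG",
    "AAAAATTTCATGCGTGCGCC",
    "AAAACTGTCATCCCGGCCCA",
    "AAAAATACGATGGTAGCGCA",
    "GCAATTGTTATGATATAACA",
    "CTAATTAAGATTGATTCTCA",
    "GAAACTGTCATGCCAGACCG",
    "CTAATTGCGAGTCAATCGCA",
    "AATATTGCGAGTGCGTCGCA",
    "GAAATTGCAATTCGTTCTCA",
    "AAAACTGTCATCCCGGCCCA",
    "AAAAATATGAGCGAAGAAAC",
]


def position_counts(seqs):
    """Return one Counter of base frequencies per motif position."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all motif instances must have the same length")
    # zip(*seqs) walks the alignment column by column.
    return [Counter(column) for column in zip(*seqs)]


def invariant_positions(seqs):
    """Return the 1-based positions where every instance has the same base."""
    return [i + 1 for i, c in enumerate(position_counts(seqs)) if len(c) == 1]


counts = position_counts(INSTANCES)
for pos, c in enumerate(counts, start=1):
    # Print the bases at each position, most frequent first.
    print(pos, " ".join(f"{base}:{n}" for base, n in c.most_common()))
```

With these instances, positions 4, 6, and 10 are invariant (A, T, and A, respectively), while position 1 is split evenly between G and A (six instances each).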
Table S4 – Version 2: Locations of glucose motif #2. Genes predicted to be in transcription units with
genes downstream of the motif are shown indented.
Putative Transcription Units Diverging from                     Motif Sequence e      Bases from Start
Intergenic Region with Motif f                                                        Codon of First Gene
CC0084 lysyl-tRNA synthetase, lysS                              GAAACTGTCATCCCGGCCAA  155
CC0138 sensor histidine kinase/response regulator               GCAATTGCAAGTCAGTCGCA  142/99
  CC0137 ampG protein, putative b /
CC0139 TonB-dependent receptor
CC0214 TonB-dependent receptor c                                GAGACTGTTATGAAATAACG  62
CC0669 hypothetical protein c                                   AAAAATTTCATGCGTGCGCC  32
CC0820 Smp-30/Cgr1 family protein                               AAAACTGTCATCCCGGCCCA  118
  CC0819 dehydratase, IlvD/Edd family
CC1493 phosphoenolpyruvate carboxylase, ppc c                   AAAAATACGATGGTAGCGCA  170
CC1517 TonB-dependent receptor c/                               GCAATTGTTATGATATAACA  93/74
CC1518 ABC transporter, ATP-binding protein c
  CC1519 hypothetical protein
  CC1520 hypothetical protein c
CC2193 hypothetical protein c                                   CTAATTAAGATTGATTCTCA  71
CC2226 EF hand domain protein c                                 GAAACTGTCATGCCAGACCG  152
CC2367 hypothetical protein d                                   CTAATTGCGAGTCAATCGCA  127
CC2928 TonB-dependent receptor c                                AATATTGCGAGTGCGTCGCA  31
  CC2927 Uncharacterized iron-regulated membrane protein a, d
  CC2926 hypothetical protein d
CC3059 conserved hypothetical protein c                         GAAATTGCAATTCGTTCTCA  65
  CC3060 conserved hypothetical protein c
CC3062 thiamine biosynthesis protein apbE, putative c           AAAACTGTCATCCCGGCCCA  102
  CC3063 sulfite reductase (NADPH) flavoprotein, cysJ
CC3163 hypothetical protein c                                   AAAAATATGAGCGAAGAAAC  46
a
Annotation is from the COG database (2, 3). All other annotations came from GenBank (1). Gene names are
from TIGR (http://www.tigr.org).
b
No microarray data was available for the gene.
c
The gene was up in M2G and met significance criteria.
d
The gene was up in M2G but did not meet significance criteria.
e
The 475 bases upstream of each gene and the first 125 bases of each gene (to provide robustness against
misplaced start codons) were searched. Four instances of the motif were omitted because the microarray
data did not support their inclusion and the motif was either more than 300 bp from the start codon or
overlapped a coding region.
f
When two genes diverge from an intergenic region and the microarray data suggested that only one was
up in M2G, only that gene is shown.
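The search window and omission rule described in the footnotes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the analysis: coordinates are taken to be 0-based positions on the forward strand, windows are clipped at the sequence ends rather than wrapped around a circular chromosome, and the omission rule is read as "not supported by the microarray data, and either more than 300 bases from the start codon or overlapping a coding region". The names search_window and should_omit are hypothetical.

```python
UPSTREAM = 475      # bases searched upstream of the annotated start codon
DOWNSTREAM = 125    # bases searched inside the gene, counted from the start codon
MAX_DISTANCE = 300  # distance cutoff used in the omission rule


def search_window(start, strand, genome_length):
    """Return the searched region as a 0-based, half-open (lo, hi) interval.

    `start` is the forward-strand coordinate of the first base of the start
    codon. Assumption: the window is clipped to the sequence, not wrapped.
    """
    if strand == "+":
        lo, hi = start - UPSTREAM, start + DOWNSTREAM
    elif strand == "-":
        # For a reverse-strand gene, upstream lies at higher coordinates.
        lo, hi = start - DOWNSTREAM + 1, start + UPSTREAM + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(0, lo), min(genome_length, hi)


def should_omit(supported_by_microarray, distance_to_start, overlaps_coding):
    """Apply the omission rule as interpreted in the lead-in (an assumption)."""
    poorly_placed = distance_to_start > MAX_DISTANCE or overlaps_coding
    return (not supported_by_microarray) and poorly_placed


print(search_window(1000, "+", 5000))
print(should_omit(False, 350, False))
```

Under this reading, an instance supported by the microarray data is kept regardless of its position, which matches the footnote's statement that both conditions had to hold for an instance to be omitted.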
References
1. Nierman, W. C., T. V. Feldblyum, M. T. Laub, I. T. Paulsen, K. E. Nelson, J.
   A. Eisen, J. F. Heidelberg, M. R. Alley, N. Ohta, J. R. Maddock, I. Potocka,
   W. C. Nelson, A. Newton, C. Stephens, N. D. Phadke, B. Ely, R. T. DeBoy, R.
   J. Dodson, A. S. Durkin, M. L. Gwinn, D. H. Haft, J. F. Kolonay, J. Smit, M.
   B. Craven, H. Khouri, J. Shetty, K. Berry, T. Utterback, K. Tran, A. Wolf, J.
   Vamathevan, M. Ermolaeva, O. White, S. L. Salzberg, J. C. Venter, L.
   Shapiro, C. M. Fraser, and J. Eisen. 2001. Complete genome sequence of
   Caulobacter crescentus. Proc. Natl. Acad. Sci. USA 98:4136-4141.
2. Tatusov, R. L., E. V. Koonin, and D. J. Lipman. 1997. A genomic perspective
   on protein families. Science 278:631-637.
3. Tatusov, R. L., D. A. Natale, I. V. Garkavtsev, T. A. Tatusova, U. T.
   Shankavaram, B. S. Rao, B. Kiryutin, M. Y. Galperin, N. D. Fedorova, and
   E. V. Koonin. 2001. The COG database: new developments in phylogenetic
   classification of proteins from complete genomes. Nucleic Acids Res. 29:22-28.