Table S4 – Version 1: Locations of glucose motif #2.

Genes Diverging from Intergenic Region    Motif Sequence [b]    Bases from Start Codon
CC0084                                    GAAACTGTCATCCCGGCCAA  155
CC0138 / CC0139 [e]                       GCAATTGCAAGTCAGTCGCA  142/99
CC0214 [c, e]                             GAGACTGTTATGAAATAACG  62
CC0669 [c]                                AAAAATTTCATGCGTGCGCC  32
CC0820 [f]                                AAAACTGTCATCCCGGCCCA  118
CC1493 [c]                                AAAAATACGATGGTAGCGCA  170
CC1517 [c, e] / CC1518 [c]                GCAATTGTTATGATATAACA  93/74
CC2193 [a, c]                             CTAATTAAGATTGATTCTCA  71
CC2226 [a, c]                             GAAACTGTCATGCCAGACCG  152
CC2367 [d]                                CTAATTGCGAGTCAATCGCA  127
CC2928 [c]                                AATATTGCGAGTGCGTCGCA  31
CC3059 [c]                                GAAATTGCAATTCGTTCTCA  65
CC3062 [c]                                AAAACTGTCATCCCGGCCCA  102
CC3163 [a, c]                             AAAAATATGAGCGAAGAAAC  46
Multilevel Consensus                      AAAAATGTGATGCAATCACA
                                          G T ACC GTGCTGAC C C A G

[a] Two genes diverge from the intergenic region; only the gene shown was up in M2G, but it did not necessarily meet the significance criteria.
[b] The 475 bases upstream of each gene and the first 125 bases of each gene (to provide robustness against misplaced start codons) were searched. Four instances of the motif were omitted because the microarray data did not support their inclusion and the motif was either more than 300 base pairs from the start codon or overlapped a coding region.
[c] Gene was up in M2G and met significance criteria.
[d] Gene was up in M2G but did not meet significance criteria.
[e] Gene is significantly up in M2.
[f] Gene is significantly up in M2X.

Table S4 – Version 2: Locations of glucose motif #2. Genes predicted to be in transcription units with genes downstream of the motif are shown indented.

Putative Transcription Units Diverging from               Motif Sequence [e]    Bases from Start Codon
Intergenic Region with Motif [f]                                                of First Gene
CC0084 lysyl-tRNA synthetase, lysS                        GAAACTGTCATCCCGGCCAA  155
CC0138 sensor histidine kinase/response regulator         GCAATTGCAAGTCAGTCGCA  142/99
    CC0137 ampG protein, putative [b]
/ CC0139 TonB-dependent receptor
CC0214 TonB-dependent receptor [c]                        GAGACTGTTATGAAATAACG  62
CC0669 hypothetical protein [c]                           AAAAATTTCATGCGTGCGCC  32
CC0820 Smp-30/Cgr1 family protein                         AAAACTGTCATCCCGGCCCA  118
    CC0819 dehydratase, IlvD/Edd family
CC1493 phosphoenolpyruvate carboxylase, ppc [c]           AAAAATACGATGGTAGCGCA  170
CC1517 TonB-dependent receptor [c]                        GCAATTGTTATGATATAACA  93/74
/ CC1518 ABC transporter, ATP-binding protein [c]
    CC1519 hypothetical protein
    CC1520 hypothetical protein [c]
CC2193 hypothetical protein [c]                           CTAATTAAGATTGATTCTCA  71
CC2226 EF hand domain protein [c]                         GAAACTGTCATGCCAGACCG  152
CC2367 hypothetical protein [d]                           CTAATTGCGAGTCAATCGCA  127
CC2928 TonB-dependent receptor [c]                        AATATTGCGAGTGCGTCGCA  31
    CC2927 Uncharacterized iron-regulated membrane protein [a, d]
    CC2926 hypothetical protein [d]
CC3059 conserved hypothetical protein [c]                 GAAATTGCAATTCGTTCTCA  65
    CC3060 conserved hypothetical protein [c]
CC3062 thiamine biosynthesis protein apbE, putative [c]   AAAACTGTCATCCCGGCCCA  102
    CC3063 sulfite reductase (NADPH) flavoprotein, cysJ
CC3163 hypothetical protein [c]                           AAAAATATGAGCGAAGAAAC  46

[a] Annotation from COG annotations (2, 3). All other annotations came from GenBank (1). Gene names are from TIGR (http://www.tigr.org).
[b] No microarray data were available for the gene.
[c] The gene was up in M2G and met significance criteria.
[d] The gene was up in M2G but did not meet significance criteria.
[e] The 475 bases upstream of each gene and the first 125 bases of each gene (to provide robustness against misplaced start codons) were searched.
Four instances of the motif were omitted because the microarray data did not support their inclusion and the motif was either more than 300 bp from the start codon or overlapped a coding region.
[f] When two genes diverge from an intergenic region and the microarray data suggested that only one was up in M2G, only that gene is shown.

References

1. Nierman, W. C., T. V. Feldblyum, M. T. Laub, I. T. Paulsen, K. E. Nelson, J. A. Eisen, J. F. Heidelberg, M. R. Alley, N. Ohta, J. R. Maddock, I. Potocka, W. C. Nelson, A. Newton, C. Stephens, N. D. Phadke, B. Ely, R. T. DeBoy, R. J. Dodson, A. S. Durkin, M. L. Gwinn, D. H. Haft, J. F. Kolonay, J. Smit, M. B. Craven, H. Khouri, J. Shetty, K. Berry, T. Utterback, K. Tran, A. Wolf, J. Vamathevan, M. Ermolaeva, O. White, S. L. Salzberg, J. C. Venter, L. Shapiro, C. M. Fraser, and J. Eisen. 2001. Complete genome sequence of Caulobacter crescentus. Proc. Natl. Acad. Sci. USA 98:4136-4141.
2. Tatusov, R. L., E. V. Koonin, and D. J. Lipman. 1997. A genomic perspective on protein families. Science 278:631-637.
3. Tatusov, R. L., D. A. Natale, I. V. Garkavtsev, T. A. Tatusova, U. T. Shankavaram, B. S. Rao, B. Kiryutin, M. Y. Galperin, N. D. Fedorova, and E. V. Koonin. 2001. The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res. 29:22-28.
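
Supplementary sketch: the search window described in the table footnotes (475 bases upstream of each gene plus the first 125 bases of the gene) could be extracted along the following lines. This is a minimal illustration under stated assumptions, not the procedure actually used for the table. The function names, the 0-based coordinate convention, and the treatment of the genome as a linear string (windows are simply truncated at the ends rather than wrapped around a circular chromosome) are all hypothetical choices made for the example.

```python
# Hypothetical helper names and coordinate conventions; only the window
# sizes (475 upstream, 125 downstream) come from the table footnotes.
UPSTREAM = 475    # bases searched upstream of the annotated start codon
DOWNSTREAM = 125  # bases searched from the start codon into the gene

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def search_window(genome, start, strand):
    """Return the search region for one gene, read 5'->3' on the gene's strand.

    genome: forward-strand sequence (assumed linear for simplicity)
    start:  0-based forward-strand index of the first base of the start codon
    strand: "+" or "-"
    """
    if strand == "+":
        lo = max(0, start - UPSTREAM)
        hi = min(len(genome), start + DOWNSTREAM)
        return genome[lo:hi]
    if strand == "-":
        # For a reverse-strand gene, upstream lies at higher forward
        # coordinates and the gene body at lower ones.
        lo = max(0, start - DOWNSTREAM + 1)
        hi = min(len(genome), start + UPSTREAM + 1)
        return reverse_complement(genome[lo:hi])
    raise ValueError("strand must be '+' or '-'")
```

Away from the sequence ends, each window is 600 bases long, with the start codon beginning at offset 475. Because a misannotated start codon shifts the true start by some unknown amount, including the first 125 bases of the gene keeps nearby motif instances inside the searched region.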