An Iterative Monte Carlo Simulation Algorithm Reveals Subtle Variations in Common Regulatory Motifs of Closely Related Transcription Factors

Seitzer, Phillip M.*† and Facciotti, Marc T.*†
*Department of Biomedical Engineering, †Genome Center, University of California at Davis, Davis, CA 95616-8079
Tel: 513-265-1568 Corr. E-mail: pmseitzer@ucdavis.edu

Summary: Application of a novel iterative motif-finding algorithm, cast in a Monte Carlo simulation framework, to experimentally determined transcription factor binding sites in Halobacterium salinarum NRC-1 distinguishes small differences between the binding sites of closely related transcription factors. The combination of high-resolution ChIP-Seq data and this algorithm recovers a binding motif that is consistent with the protein-DNA interactions predicted by structural analysis of an archaeal homolog of the general transcription factor TFIIB. The motif sensitivity provided by combining this algorithm with accurate transcription factor localization data will drive further inquiry into the mechanisms underlying gene regulatory network evolution.

In this study we examine the DNA binding sites of homologous transcription factors (TFs) through a detailed analysis of high-resolution, genome-wide localization data for three archaeal homologs of the general transcription factor TFIIB. Chromatin immunoprecipitation followed by next-generation sequencing (ChIP-Seq) revealed the genomic binding locations of three closely related transcription factors (tfbB, tfbD, and tfbG). We further demonstrate that an iterative Monte Carlo simulation algorithm helps distinguish the binding sites of each of these transcription factors and provides clear, testable hypotheses regarding the mechanisms of promoter discrimination in this family of TFs.
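The abstract does not spell out the algorithm's steps, so the following is only a generic illustration of what a Monte Carlo motif search can look like, not the authors' method. It is a minimal sketch that assumes one site per sequence, a fixed motif width, a uniform background, and a Metropolis-style acceptance rule; the function names, the scoring function, and the `temperature` parameter are all hypothetical choices made for this example.

```python
import math
import random
from collections import Counter


def alignment_score(sites):
    """Sum of per-column information content (bits), assuming a uniform
    A/C/G/T background. Higher means a more conserved alignment."""
    n = len(sites)
    score = 0.0
    for column in zip(*sites):
        counts = Counter(column)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        score += 2.0 - entropy
    return score


def monte_carlo_motif_search(seqs, width, steps=5000, temperature=0.05, seed=0):
    """Hypothetical Metropolis-style search: one candidate site per sequence.

    Each step moves one randomly chosen site to a random new start. The move
    is kept if the score does not drop, or otherwise with probability
    exp(delta / temperature). Returns the best starts seen and their score.
    """
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]

    def extract(positions):
        return [s[p:p + width] for s, p in zip(seqs, positions)]

    current = alignment_score(extract(starts))
    best_starts, best = list(starts), current
    for _ in range(steps):
        i = rng.randrange(len(seqs))
        old = starts[i]
        starts[i] = rng.randrange(len(seqs[i]) - width + 1)
        proposed = alignment_score(extract(starts))
        delta = proposed - current
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = proposed
            if current > best:
                best, best_starts = current, list(starts)
        else:
            starts[i] = old  # reject the move and restore the previous site
    return best_starts, best


# Toy input (made-up sequences, not data from the study).
toy_seqs = ["CCGTTTATAGC", "GGCTTTATACC", "ACGCTTTATAG", "TTTATAGGCCG"]
toy_starts, toy_score = monte_carlo_motif_search(toy_seqs, width=6)
```

An "iterative" variant could rerun such a search on the sites left unexplained by the motifs found so far, but that is a guess at one possible design rather than something the abstract states.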
The conserved elements of the discovered motifs correspond precisely with crystallographic data describing the physical associations between archaeal TFIIB homologs and DNA [1]. The coordinated application of high-resolution experimental TF localization techniques and regulatory motif detection using the algorithmic approach described herein will help to uncover the structural and biophysical basis for promoter selection by closely related TFs. This is critical for our continued understanding of the processes underlying gene regulatory network (GRN) evolution.

Regulatory motif analysis can be a useful tool for identifying TF binding sites and segregating them into TF-specific groups. However, because homologous TFs are evolutionarily related, the promoters they bind often share structural features. This confounds the ability of traditional motif-detection approaches to uncover the subtle structural differences between promoters that confer a distinct function on each TF.

Fig. 1. a) tfbB, tfbD, and tfbG datasets, with the number of sites in parentheses and a colored legend. b) Representative 'best motif' logos determined for 5 tfb datasets. The top-left motif represents sites on the genome where only tfbB may bind; middle left, tfbD only; bottom left, tfbG only; top right, tfbB or tfbD; bottom right, tfbB, tfbD, or tfbG. The gray bar over the tfbB-only motif indicates the TFB and TBP binding regions determined by protein residue-nucleotide interactions [1]. All five motifs show a highly conserved 4-nucleotide string of As and Ts in the second TFB binding region and few distinguishing characteristics in the TBP binding region. Small differences in the first TFB binding region distinguish the different binding sites.
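One way to make "small differences" between two motifs concrete is to compare their aligned sites column by column. The sketch below is an illustration only, not the analysis used in this study. It assumes two sets of equal-width sites over the uppercase alphabet A/C/G/T, adds a pseudocount (an arbitrary choice), and reports the Jensen-Shannon divergence in bits at each position. All function names and the toy sites are hypothetical.

```python
import math

BASES = "ACGT"


def column_frequencies(sites, pseudocount=0.5):
    """Per-position base frequencies for equal-width, uppercase A/C/G/T sites."""
    width = len(sites[0])
    freqs = []
    for j in range(width):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[j]] += 1
        total = sum(counts.values())
        freqs.append({b: counts[b] / total for b in BASES})
    return freqs


def js_divergence(p, q):
    """Jensen-Shannon divergence (bits) between two base distributions."""
    m = {b: 0.5 * (p[b] + q[b]) for b in BASES}

    def kl(a, ref):
        return sum(a[b] * math.log2(a[b] / ref[b]) for b in BASES if a[b] > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def per_position_divergence(sites_a, sites_b):
    """Divergence at each motif position; larger values mark positions
    where the two site sets differ most."""
    freqs_a = column_frequencies(sites_a)
    freqs_b = column_frequencies(sites_b)
    return [js_divergence(p, q) for p, q in zip(freqs_a, freqs_b)]


# Toy sites (made up for illustration): the sets differ only at position 0.
sites_a = ["GATATA", "GTTATA", "GATATA"]
sites_b = ["CATATA", "CTTATA", "CATATA"]
divergence = per_position_divergence(sites_a, sites_b)
most_different = max(range(len(divergence)), key=divergence.__getitem__)
```

With real data, positions with high divergence would be candidates for the distinguishing features, while a shared A/T-rich stretch would show divergence near zero.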
For the transcription factors analyzed in this study, our algorithmic approach appears to find distinguishing features of similar regulatory motifs. These features can be interpreted in the context of structural data to assign motif variations to individual TFs (see Fig. 1b). The data further suggest interesting, experimentally testable hypotheses regarding promoter selection by TFIIB homologs in archaea.

[1] Littlefield O et al., "The structural basis for the oriented assembly of a TBP/TFB/promoter complex", PNAS, vol. 96, 1999.