integrated approach for modeling physiological, biomechanical, and

ANALYSIS OF P53 BINDING SITES BY USING CHIP-SEQ DATA

I.S. Yevshin1,2, Yu.V. Kondrakhin1,3, M. Turunen4, T. Kivioja4, F. Nikulenkov5, R.N. Sharipov1,6, J. Taipale4, G. Selivanova5, F.A. Kolpakov1,3,*

1Institute of Systems Biology, Ltd, Novosibirsk, Russia
2Novosibirsk State University, Novosibirsk, Russia
3Design Technological Institute of Digital Techniques SB RAS, Novosibirsk, Russia
4National Public Health Institute, Helsinki, Finland
5Karolinska Institutet, Stockholm, Sweden
6Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
e-mail: fedor@biouml.org
*Corresponding author

Key words: p53, binding site recognition, ChIP-seq, position weight matrix

Motivation and Aim: Transcription factor p53 is a well-known tumor suppressor. Mutations in p53 binding sites are hallmarks of many types of cancer. The canonical p53 binding site consists of two decameric sequences, PuPuPuC(A/T)(T/A)GPyPyPy, separated by a spacer of 0–13 bp. Earlier models for recognition of p53 sites were built from separate small training sets obtained by methods less comprehensive than ChIP-seq. Recently, five sets of ChIP-seq data were obtained within the framework of the “Net2Drug” project, which made it possible to: 1) compare methods for identification of transcription factor (TF)-binding fragments; 2) construct a more accurate method for p53 binding site prediction.

Methods and Algorithms: ChIP-seq data were obtained in an experiment in which the breast cancer cell line MCF7 was treated with activators of p53: 10 μM Nutlin-3a, 0.1 and 1 μM RITA (Reactivation of p53 and Induction of Tumor cell Apoptosis), and 100 μM 5-fluorouracil. Three methods, SISSRs (Site Identification from Short Sequence Reads), MACS (Model-based Analysis of ChIP-Seq) and KOLI, were applied to identify p53-binding fragments. Potential binding sites were recognized with the extended position weight matrix (PWM) method [1]. Our new alignment method and a clustering method adapted to ChIP-seq data (unpublished) were applied to construct a set of improved PWMs.

Results and Conclusion: Different methods for identification of TF-binding fragments generate quite distinct sets of fragments. For example, in the case of 1 μM RITA, only 48.13% of the fragments identified by SISSRs, MACS and KOLI overlapped. Based on the analyzed ChIP-seq data, more effective PWMs for p53 were built. Sensitivity and False Discovery Rate were used as measures of the accuracy of the recognition procedures. According to our PWMs, the most typical motif of p53-binding sites is (A/C/T)NN(A/G)(G/a)(A/G)CATG(C/T)CCA(G/a)(A/g)CATG(C/t)(C/t)(C/t)NN. Analysis of p53-binding sites led to the conclusion that spacers of 6, 7 and 8 bp are more prominent than others. Analysis of human chromosome 6 demonstrated that SINE/Alu and LINE/L1 repeats are also significantly enriched in p53 motifs. To counter this effect, we constructed additional PWMs specific to these repeats, which allowed us to substantially reduce the number of false positives in p53-binding site prediction by our method.

Acknowledgements: This work was supported by EU grant No. 037590 “Net2Drug”.

References
1. E.A. Ananko et al. (2007) Recognition of interferon-inducible sites, promoters, and enhancers, BMC Bioinformatics, 8:56, 1-14.
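
Illustrative note: the canonical site structure stated in Motivation and Aim (two PuPuPuC(A/T)(T/A)GPyPyPy decamers separated by a spacer of 0–13 bp) can be turned into a simple consensus scanner. The sketch below is only that, a strict consensus match under the assumption that Pu means A/G and Py means C/T. It is not the extended PWM method [1] or the PWMs built in this work, and the function name `find_canonical_p53_sites` and its parameters are hypothetical.

```python
import re

# One decameric half-site: PuPuPuC(A/T)(T/A)GPyPyPy, with Pu = A/G and Py = C/T.
HALF_SITE = re.compile(r"[AG]{3}C[AT][AT]G[CT]{3}")
HALF_LEN = 10


def find_canonical_p53_sites(sequence, max_spacer=13):
    """Return (start, spacer_length, site) for every pair of consensus
    half-sites separated by 0..max_spacer bp (strict match, no mismatches)."""
    seq = sequence.upper()
    # Start positions of all windows that match the half-site consensus exactly.
    half_starts = {
        i
        for i in range(len(seq) - HALF_LEN + 1)
        if HALF_SITE.fullmatch(seq, i, i + HALF_LEN)
    }
    hits = []
    for i in sorted(half_starts):
        for spacer in range(max_spacer + 1):
            j = i + HALF_LEN + spacer  # expected start of the second half-site
            if j in half_starts:
                hits.append((i, spacer, seq[i:j + HALF_LEN]))
    return hits


if __name__ == "__main__":
    # Made-up example sequence: two half-sites with a 6 bp spacer.
    example = "AGACATGTCT" + "ACGTAC" + "GAACTTGCTT"
    for start, spacer, site in find_canonical_p53_sites(example):
        print(start, spacer, site)
```

A strict consensus scan like this misses sites with mismatches, which is one reason weight-matrix approaches such as the one used in this work are preferred for real data.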
