reviews - Rtcb.iitp.ru

advertisement
Темы обзоров для зачета по сравнительной геномике
Списки литературы по темам даны для первоначального ознакомления с предметом и не
являются исчерпывающим; библиографический поиск является частью зачетной работы.
Если тема большая, можно ее взять на двоих или даже на троих, либо ограничиться какимто одним аспектом. Чтобы мотивировать такой вариант, надо представить полную
библиографию. Пример: терминаторы и идентификация альтернативных экзонов –
маленькие темы; промоторы, связь экзонов и доменов, микроРНК – большие.
Критерии оценки (в скобках – максимальный балл):
 Полнота библиографии (6)
 Структура обзора (6)
 Описание решаемой задачи (6)
 Качество (внятность) пересказа отдельных статей, описания Интернет-ресурсов (6)
 Рисунки и таблицы: наличие необходимых, уместность существующих, структура,
информативность (6)
 Собственные идеи и соображения: чем пользоваться (программы и базы данных)?
чему верить (статьи)? что еще было бы полезно сделать и/или проверить? (10)
Сумма за просто добросовестный обзор – 30, недостижимый идеал – 40.
Варианты сдачи:
 10-15-минутный рассказ на занятии (дату надо согласовать). За сдачу в октябре
премия 8, в ноябре – 4.
 Текст (срок сдачи полдень 9 декабря).
Штрафы за просрочку:
 Текст: после 9 декабря, но до начала зачетной сессии – 4
 Рассказ: после согласованной даты без предупреждения или уважительной
причины – 8
 После начала зачетной сессии, но до зачета или на зачете – 8
 После зачета, но до экзамена или на экзамене – 12
 Весенний семестр 2006 года – 16
 Весенняя сессия 2006 года – 20
 Позднее – очков не дается
Темы:
 Поиск регуляторных микроРНК животных и растений и их мишеней







Ohler U, Yekta S, Lim LP, Bartel DP, Burge CB. Patterns of flanking sequence conservation and a
characteristic upstream motif for microRNA gene identification. RNA. 2004 Sep;10(9):1309-22.
Lai EC, Tomancak P, Williams RW, Rubin GM. Computational identification of Drosophila microRNA
genes. Genome Biol. 2003;4(7):R42.
Enright AJ, John B, Gaul U, Tuschl T, Sander C, Marks DS. MicroRNA targets in Drosophila. Genome
Biol. 2003;5(1):R1.
Nam JW, Shin KR, Han J, Lee Y, Kim VN, Zhang BT. Human microRNA prediction through a
probabilistic co-learning model of sequence and structure. Nucleic Acids Res. 2005 Jun 24;33(11):3570-81.
Altuvia Y, Landgraf P, Lithwick G, Elefant N, Pfeffer S, Aravin A, Brownstein MJ, Tuschl T, Margalit H.
Clustering and conservation patterns of human microRNAs. Nucleic Acids Res. 2005 May 12;33(8):2697706.
Rodriguez A, Griffiths-Jones S, Ashurst JL, Bradley A. Identification of mammalian microRNA host
genes and transcription units. Genome Res. 2004 Oct;14(10A):1902-10.
Lagos-Quintana M, Rauhut R, Lendeckel W, Tuschl T. Identification of novel genes coding for small
expressed RNAs. Science. 2001 Oct 26;294(5543):853-8.









Li Y, Li W, Jin YX. Computational identification of novel family members of microRNA genes in
Arabidopsis thaliana and Oryza sativa. Acta Biochim Biophys Sin (Shanghai). 2005 Feb;37(2):75-87.
Mitreva M, McCarter JP, Martin J, Dante M, Wylie T, Chiapelli B, Pape D, Clifton SW, Nutman TB,
Waterston RH. Comparative genomics of gene expression in the parasitic and free-living nematodes
Strongyloides stercoralis and Caenorhabditis elegans. Genome Res. 2004 Feb;14(2):209-20.
Sempere LF, Freemantle S, Pitha-Rowe I, Moss E, Dmitrovsky E, Ambros V. Expression profiling of
mammalian microRNAs uncovers a subset of brain-expressed microRNAs with possible roles in murine
and human neuronal differentiation. Genome Biol. 2004;5(3):R13.
Rehmsmeier M, Steffen P, Hochsmann M, Giegerich R. Fast and effective prediction of microRNA/target
duplexes. RNA. 2004 Oct;10(10):1507-17.
Krichevsky AM, King KS, Donahue CP, Khrapko K, Kosik KS. A microRNA array reveals extensive
regulation of microRNAs during brain development. RNA. 2003 Oct;9(10):1274-81. Erratum in: RNA.
2004 Mar;10(3):551.
Krek A, Grun D, Poy MN, Wolf R, Rosenberg L, Epstein EJ, MacMenamin P, da Piedade I, Gunsalus KC,
Stoffel M, Rajewsky N. Combinatorial microRNA target predictions. Nat Genet. 2005 May;37(5):495500.
Lagos-Quintana M, Rauhut R, Yalcin A, Meyer J, Lendeckel W, Tuschl T. Identification of tissue-specific
microRNAs from mouse. Curr Biol. 2002 Apr 30;12(9):735-9.
Zhang BH, Pan XP, Wang QL, Cobb GP, Anderson TA. Identification and characterization of new plant
microRNAs using EST analysis. Cell Res. 2005 May;15(5):336-60.
Xie X, Lu J, Kulbokas EJ, Golub TR, Mootha V, Lindblad-Toh K, Lander ES, Kellis M. Systematic
discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals.
Nature. 2005 Mar 17;434(7031):338-45.
 Сигналы регуляции альтернативного сплайсинга (энхансеры
сплайсинга и т.п.)








Ladd AN, Cooper TA. Finding signals that regulate alternative splicing in the post-genomic era. Genome
Biol. 2002 Oct 23;3(11):reviews0008.
Wang Z, Rolish ME, Yeo G, Tung V, Mawson M, Burge CB. Systematic identification and analysis of
exonic splicing silencers. Cell. 2004 Dec 17;119(6):831-45.
Fairbrother WG, Yeo GW, Yeh R, Goldstein P, Mawson M, Sharp PA, Burge CB. RESCUE-ESE
identifies candidate exonic splicing enhancers in vertebrate exons. Nucleic Acids Res. 2004 Jul 1;32(Web
Server issue):W187-90.
Minovitsky S, Gee SL, Schokrpur S, Dubchak I, Conboy JG. The splicing regulatory element, UGCAUG,
is phylogenetically and spatially conserved in introns that flank tissue-specific alternative exons. Nucleic
Acids Res. 2005 Feb 3;33(2):714-24.
Zhang XH, Chasin LA. Computational definition of sequence motifs governing constitutive exon splicing.
Genes Dev. 2004 Jun 1;18(11):1241-50.
Liu Y, Liu XS, Wei L, Altman RB, Batzoglou S. Eukaryotic regulatory element conservation analysis and
identification using comparative genomics. Genome Res. 2004 Mar;14(3):451-8.
Thanaraj TA, Clark F. Human GC-AG alternative intron isoforms with weak donor sites show enhanced
consensus at acceptor exon positions. Nucleic Acids Res. 2001 Jun 15;29(12):2581-93.
Cartegni L, Wang J, Zhu Z, Zhang MQ, Krainer AR. ESEfinder: A web resource to identify exonic
splicing enhancers. Nucleic Acids Res. 2003 Jul 1;31(13):3568-71.
 Предсказание оперонной структуры в бактериальных геномах






Nucleic Acids Res. 2001 Mar 1;29(5):1216-21. Prediction of operons in microbial genomes. Ermolaeva
MD, White O, Salzberg SL.
Genome Res. 2002 Aug;12(8):1221-30. Computational identification of operons in microbial genomes.
Zheng Y, Szustakowski JD, Fortnow L, Roberts RJ, Kasif S.
Westover BP, Buhler JD, Sonnenburg JL, Gordon JI. Operon prediction without a training set.
Bioinformatics. 2005 Apr 1;21(7):880-8.
Bioinformatics. 2004 Mar 22;20(5):709-17. Epub 2004 Jan 29. Using functional and organizational
information to improve genome-wide computational prediction of transcription units on pathway-genome
databases. Romero PR, Karp PD.
Nucleic Acids Res. 2004 Apr 19;32(7):2147-57. Print 2004. Operon prediction by comparative genomics:
an application to the Synechococcus sp. WH8102 genome. Chen X, Su Z, Dam P, Palenik B, Xu Y, Jiang
T.
Bioinformatics. 2002;18 Suppl 1:S329-36. A powerful non-homology method for the prediction of operons
in prokaryotes. Moreno-Hagelsieb G, Collado-Vides J.










Nucleic Acids Res. 2004 Jul 13;32(12):3689-702. Print 2004. Genome-wide operon prediction in
Staphylococcus aureus. Wang L, Trawick JD, Yamamoto R, Zamudio C.
Nucleic Acids Res. 2005 Feb 8;33(3):880-92. Print 2005. A novel method for accurate operon predictions
in all sequenced prokaryotes. Price MN, Huang KH, Alm EJ, Arkin AP.
Mol Biol Evol. 1999 Mar;16(3):332-46. Evolutionary instability of operon structures disclosed by sequence
comparisons of complete microbial genomes. Itoh T, Takemoto K, Mori H, Gojobori T.
Bioinformatics. 2003;19 Suppl 1:i34-43. Predicting bacterial transcription units using sequence and
expression data. Bockhorst J, Qiu Y, Glasner J, Liu M, Blattner F, Craven M.
Nucleic Acids Res. 2005 Jun 7;33(10):3253-62. Print 2005. A universally applicable method of operon
map prediction on minimally annotated genomes using conserved genomic context. Edwards MT, Rison
SC, Stoker NG, Wernisch L.
Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W318-20. PredictRegulon: a web server for the
prediction of the regulatory protein binding sites and operons in prokaryote genomes. Yellaboina S,
Seshadri J, Kumar MS, Ranjan A.
Nucleic Acids Res. 2002 Jul 1;30(13):2886-93. Co-expression pattern from DNA microarray experiments
as a tool for operon prediction. Sabatti C, Rohlin L, Oh MK, Liao JC.
Bioinformatics. 2005 Apr 15;21(8):1403-7. Epub 2004 Nov 25. A fuzzy guided genetic algorithm for
operon prediction. Jacob E, Sasikumar R, Nair KN.
Bioinformatics. 2002;18 Suppl 1:S337-44. Identifying operons and untranslated regions of transcripts using
Escherichia coli RNA expression analysis. Tjaden B, Haynor DR, Stolyar S, Rosenow C, Kolker E.
Genome Res. 2005 Jun;15(6):809-19. Operon formation is driven by co-regulation and not by horizontal
gene transfer. Price MN, Huang KH, Arkin AP, Alm EJ.
 Поиск коротких РНКовых генов в бактериальных геномах








Bioinformatics. 2004 Mar 22;20(5):599-603. Epub 2004 Jan 22. Conservation analysis of small RNA genes
in Escherichia coli. Zhang Y, Zhang Z, Ling L, Shi B, Chen R.
Nucleic Acids Res. 2005 Jul 26;33(13):4096-105. Print 2005. sRNAPredict: an integrative computational
approach to identify sRNAs in bacterial genomes. Livny J, Fogel MA, Davis BM, Waldor MK.
Biosystems. 2002 Mar-May;65(2-3):157-77. A bioinformatics based approach to discover small RNA
genes in the Escherichia coli genome. Chen S, Lesnik EA, Hall TA, Sampath R, Griffey RH, Ecker DJ,
Blyn LB.
Curr Biol. 2001 Sep 4;11(17):1369-73. Computational identification of noncoding RNAs in E. coli by
comparative genomics. Rivas E, Klein RJ, Jones TA, Eddy SR.
Argaman L, Hershberg R, Vogel J, Bejerano G, Wagner EG, Margalit H, Altuvia S. Novel small RNAencoding genes in the intergenic regions of Escherichia coli. Curr Biol. 2001 Jun 26;11(12):941-50.
Washietl S, Hofacker IL. Consensus folding of aligned sequences as a new measure for the detection of
functional RNAs by comparative genomics. J Mol Biol. 2004 Sep 3;342(1):19-30.
Saetrom P, Sneve R, Kristiansen KI, Snove O Jr, Grunfeld T, Rognes T, Seeberg E. Predicting non-coding
RNA genes in Escherichia coli with boosted genetic programming. Nucleic Acids Res. 2005 Jun
7;33(10):3263-70.
Kawano M, Reynolds AA, Miranda-Rios J, Storz G. Detection of 5'- and 3'-UTR-derived small RNAs and
cis-encoded antisense RNAs in Escherichia coli. Nucleic Acids Res. 2005 Feb 17;33(3):1040-50.
 Предсказание терминаторов в бактериальных геномах







J Mol Biol. 2000 Aug 4;301(1):27-33. Prediction of transcription terminators in bacterial genomes.
Ermolaeva MD, Khalak HG, White O, Smith HO, Salzberg SL.
Nucleic Acids Res. 2002 Feb 1;30(3):675-84. Conserved economics of transcription termination in
eubacteria. Unniraman S, Prakash R, Nagaraja V.
Nucleic Acids Res. 1998 Dec 1;26(23):5456-63. Analysis of complete genomes suggests that many
prokaryotes do not rely on hairpin formation in transcription termination. Washio T, Sasayama J, Tomita
M.
Nucleic Acids Res. 2004 Apr 1;32(6):1973-81. Print 2004. Transcriptional organization of the Clostridium
acetobutylicum genome. Paredes CJ, Rigoutsos I, Papoutsakis ET.
J Mol Biol. 1990 Dec 20;216(4):835-58. Prediction of rho-independent Escherichia coli transcription
terminators. A statistical analysis of their RNA stem-loop structures. d'Aubenton Carafa Y, Brody E,
Thermes C.
Nucleic Acids Res. 2001 Sep 1;29(17):3583-94. Prediction of rho-independent transcriptional terminators
in Escherichia coli. Lesnik EA, Sampath R, Levene HB, Henderson TJ, McNeil JA, Ecker DJ.
J Biomol Struct Dyn. 1986 Feb;3(4):705-23. Terminators of transcription with RNA polymerase from
Escherichia coli: what they look like and how to find them. Brendel V, Hamm GH, Trifonov EN.
 Предсказание промоторов в бактериальных геномах



























http://www.biomedcentral.com/1471-2105/6/1
Kiryu H, Oshima T, Asai K. Extracting relations between promoter sequences and their strengths from
microarray data. Bioinformatics. 2005 Apr 1;21(7):1062-8.
Sheng Wu Hua Xue Yu Sheng Wu Wu Li Xue Bao (Shanghai). 2003 Apr;35(4):317-24. Prediction of
prokaryotic promoters based on prediction of transcriptional units. Lin JC, Xu JL, Luo JH, Li YX.
Nucleic Acids Res. 2004 Apr 1;32(6):1973-81. Print 2004. Transcriptional organization of the Clostridium
acetobutylicum genome. Paredes CJ, Rigoutsos I, Papoutsakis ET.
Kanhere A, Bansal M. A novel method for prokaryotic promoter prediction based on DNA stability. BMC
Bioinformatics. 2005 Jan 5;6(1):1.
Vanet A, Marsan L, Labigne A, Sagot MF. Inferring regulatory elements from a whole genome. An
analysis of Helicobacter pylori sigma(80) family of promoter signals. J Mol Biol. 2000 Mar 24;297(2):33553.
Prakash O, Guntaka RV, Sarkar NH. Evidence for a prokaryotic promoter in the murine mammary tumor
virus long terminal repeat. Gene. 1983 Aug;23(2):117-30.
Gordon L, Chervonenkis AY, Gammerman AJ, Shahmuradov IA, Solovyev VV. Sequence alignment
kernel for recognition of promoter regions. Bioinformatics. 2003 Oct 12;19(15):1964-71.
Burden S, Lin YX, Zhang R. Improving promoter prediction for the NNPP2.2 algorithm: a case study using
Escherichia coli DNA sequences. Bioinformatics. 2005 Mar 1;21(5):601-7.
Horton PB, Kanehisa M. An assessment of neural network and statistical approaches for prediction of E.
coli promoter sites. Nucleic Acids Res. 1992 Aug 25;20(16):4331-8.
Alexandrov NN, Mironov AA. Application of a new method of pattern recognition in DNA sequence
analysis: a study of E. coli promoters. Nucleic Acids Res. 1990 Apr 11;18(7):1847-52.
Leung S, Mellish C, Robertson D. Basic Gene Grammars and DNA-ChartParser for language processing
of Escherichia coli promoter DNA sequences. Bioinformatics. 2001 Mar;17(3):226-36.
Rozkot F, Sazelova P, Pivec L. A novel method for promoter search enhanced by function-specific
subgrouping of promoters--developed and tested on E.coli system. Nucleic Acids Res. 1989 Jun
26;17(12):4799-815.
Hertz GZ, Stormo GD. Escherichia coli promoter sequences: analysis and prediction. Methods Enzymol.
1996;273:30-42.
Kalate RN, Tambe SS, Kulkarni BD. Artificial neural networks for prediction of mycobacterial promoter
sequences. Comput Biol Chem. 2003 Dec;27(6):555-64.
Nakata K, Kanehisa M, Maizel JV Jr. Discriminant analysis of promoter regions in Escherichia coli
sequences. Comput Appl Biosci. 1988 Aug;4(3):367-71.
Oliphant AR, Struhl K. Defining the consensus sequences of E.coli promoter elements by random
selection. Nucleic Acids Res. 1988 Aug 11;16(15):7673-83.
Staden R. Methods for discovering novel motifs in nucleic acid sequences. Comput Appl Biosci. 1989
Oct;5(4):293-8.
Tung CS, Harvey SC. A common structural feature in promoter sequences of E. coli. Nucleic Acids Res.
1987 Jun 25;15(12):4973-85.
Bourn WR, Babb B. Computer assisted identification and classification of streptomycete promoters.
Nucleic Acids Res. 1995 Sep 25;23(18):3696-703.
Patek M, Eikmanns BJ, Patek J, Sahm H. Promoters from Corynebacterium glutamicum: cloning,
molecular analysis and search for a consensus motif. Microbiology. 1996 May;142 ( Pt 5):1297-309.
Mulligan ME, Hawley DK, Entriken R, McClure WR. Escherichia coli promoter sequences predict in vitro
RNA polymerase selectivity. Nucleic Acids Res. 1984 Jan 11;12(1 Pt 2):789-800.
Hershberg R, Bejerano G, Santos-Zavaleta A, Margalit H. PromEC: An updated database of Escherichia
coli mRNA promoters with experimentally identified transcriptional start sites. Nucleic Acids Res. 2001
Jan 1;29(1):277.
Strohl WR. Compilation and analysis of DNA sequences associated with apparent streptomycete
promoters. Nucleic Acids Res. 1992 Mar 11;20(5):961-74.
Wosten MM, Boeve M, Koot MG, van Nuene AC, van der Zeijst BA. Identification of Campylobacter
jejuni promoter sequences. J Bacteriol. 1998 Feb;180(3):594-9.
Gaal T, Ross W, Estrem ST, Nguyen LH, Burgess RR, Gourse RL. Promoter recognition and
discrimination by EsigmaS RNA polymerase. Mol Microbiol. 2001 Nov;42(4):939-54.
Ozoline ON, Deev AA, Arkhipova MV. Non-canonical sequence elements in the promoter structure.
Cluster analysis of promoters recognized by Escherichia coli RNA polymerase. Nucleic Acids Res. 1997
Dec 1;25(23):4703-9.



Petersen L, Larsen TS, Ussery DW, On SL, Krogh A. RpoD promoters in Campylobacter jejuni exhibit a
strong periodic signal instead of a -35 box. J Mol Biol. 2003 Mar 7;326(5):1361-72.
Barrios H, Valderrama B, Morett E. Compilation and analysis of sigma(54)-dependent promoter
sequences. Nucleic Acids Res. 1999 Nov 15;27(22):4305-13.
Dartigalongue C, Missiakas D, Raina S. Characterization of the Escherichia coli sigma E regulon. J Biol
Chem. 2001 Jun 15;276(24):20866-75.
 Positional clustering (позиционный анализ)













Genome Res. 2001 Mar;11(3):356-72. Genome alignment, evolution of prokaryotic genome organization,
and prediction of gene function using genomic context. Wolf YI, Rogozin IB, Kondrashov AS, Koonin EV.
Nucleic Acids Res. 2005 May 2;33(8):2521-30. Print 2005. Nebulon: a system for the inference of
functional relationships of gene products from the rearrangement of predicted operons. Janga SC, ColladoVides J, Moreno-Hagelsieb G.
Nucleic Acids Res. 2000 Oct 15;28(20):4029-36. Automatic detection of conserved gene clusters in
multiple genomes by graph comparison and P-quasi grouping. Fujibuchi W, Ogata H, Matsuda H, Kanehisa
M.
J Mol Biol. 2001 Aug 24;311(4):639-56. SNAPping up functionally related genes based on context
information: a colinearity-free approach. Kolesov G, Mewes HW, Frishman D.
In Silico Biol. 1999;1(2):93-108. Use of contiguity on the chromosome to predict functional coupling.
Overbeek R, Fonstein M, D'Souza M, Pusch GD, Maltsev N.
Genome Biol. 2001;2(6):RESEARCH0020. Epub 2001 Jun 1. Evolution of gene order conservation in
prokaryotes. Tamames J.
Genetics. 1996 Aug;143(4):1843-60. Selfish operons: horizontal transfer may drive the evolution of gene
clusters. Lawrence JG, Roth JR.
Annu Rev Microbiol. 2003;57:419-40. Gene organization: selection, selfishness, and serendipity. Lawrence
JG.
Genome Biol. 2003;4(9):R55. Epub 2003 Aug 29. Evolution of mosaic operons by horizontal gene transfer
and gene displacement in situ. Omelchenko MV, Makarova KS, Wolf YI, Rogozin IB, Koonin EV.
Genome Res. 2003 May;13(5):991-8. Epub 2003 Apr 14. Comparing bacterial genomes through
conservation profiles. Martin MJ, Herrero J, Mateos A, Dopazo J.
Genome Res. 2000 Aug;10(8):1204-10. Predicting protein function by genomic context: quantitative
evaluation and qualitative inferences. Huynen M, Snel B, Lathe W 3rd, Bork P.
Nucleic Acids Res. 2002 May 15;30(10):2212-23. Connected gene neighborhoods in prokaryotic genomes.
Rogozin IB, Makarova KS, Murvai J, Czabarka E, Wolf YI, Tatusov RL, Szekely LA, Koonin EV.
Trends Genet. 2004 Mar;20(3):126-31. Global analysis of bacterial transcription factors to predict cellular
target processes. Doerks T, Andrade MA, Lathe W 3rd, von Mering C, Bork P.
 Phylogenetic profiling (филогенетические профили)











http://bioinformatics.icmb.utexas.edu/plex
Genome Res. 2003 May;13(5):991-8. Epub 2003 Apr 14. Comparing bacterial genomes through
conservation profiles. Martin MJ, Herrero J, Mateos A, Dopazo J.
Genome Biol. 2004;5(5):R32. Epub 2004 Apr 27. Detection of evolutionarily stable fragments of cellular
pathways by hierarchical clustering of phyletic patterns. Glazko GV, Mushegian AR.
Genome Res. 2004 Mar;14(3):391-7. Quantifying modularity in the evolution of biomolecular systems.
Snel B, Huynen MA.
Genome Res. 2000 Aug;10(8):1204-10. Predicting protein function by genomic context: quantitative
evaluation and qualitative inferences. Huynen M, Snel B, Lathe W 3rd, Bork P.
Bioinformatics. 2003 Aug 12;19(12):1524-30. Identification of functional links between genes using
phylogenetic profiles. Wu J, Kasif S, DeLisi C.
Nat Biotechnol. 2003 Sep;21(9):1055-62. Epub 2003 Aug 17. Discovery of uncharacterized cellular
systems by genome-wide analysis of functional linkages. Date SV, Marcotte EM.
Bioinformatics. 2005 Feb 15;21(4):464-70. Epub 2004 Sep 16. Improving genome annotations using
phylogenetic profile anomaly detection. Mikkelsen TS, Galagan JE, Mesirov JP.
Gutierrez RA, Green PJ, Keegstra K, Ohlrogge JB. Phylogenetic profiling of the Arabidopsis thaliana
proteome: what proteins distinguish plants from other organisms? Genome Biol. 2004;5(8):R53.
Pellegrini M, Marcotte EM, Thompson MJ, Eisenberg D, Yeates TO. Assigning protein functions by
comparative genome analysis: protein phylogenetic profiles. Proc Natl Acad Sci U S A. 1999 Apr
13;96(8):4285-8.
Makarova KS, Wolf YI, Koonin EV. Potential genomic determinants of hyperthermophily. Trends Genet.
2003 Apr;19(4):172-6.
 Phylogenetic footprinting (сравнительный анализ регуляторных
сайтов): программы и результаты – бактерии и эукариоты (растения,
дрожжи, нематоды, дрозофилы, позвоночные)

























Genome Res. 2002 Oct;12(10):1523-32. Factors influencing the identification of transcription factor
binding sites by cross-species comparison. McCue LA, Thompson W, Carmack CS, Lawrence CE.
Genome Res. 2004 Jul;14(7):1362-73. Regulog analysis: detection of conserved regulatory networks across
bacteria: application to Staphylococcus aureus. Alkema WB, Lenhard B, Wasserman WW.
Wang T, Stormo GD. Combining phylogenetic data with co-regulated genes to identify regulatory
motifs. Bioinformatics. 2003 Dec 12;19(18):2369-80.
Narasimhan C, LoCascio P, Uberbacher E. Background rareness-based iterative multiple sequence
alignment algorithm for regulatory element detection. Bioinformatics. 2003 Oct 12;19(15):1952-63.
Qin ZS, McCue LA, Thompson W, Mayerhofer L, Lawrence CE, Liu JS. Identification of co-regulated
genes through Bayesian clustering of predicted regulatory binding sites. Nat Biotechnol. 2003
Apr;21(4):435-9.
Sharan R, Ovcharenko I, Ben-Hur A, Karp RM. CREME: a framework for identifying cis-regulatory
modules in human-mouse conserved segments. Bioinformatics. 2003;19 Suppl 1:i283-91.
McGuire AM, Hughes JD, Church GM. Conservation of DNA regulatory motifs and discovery of new
motifs in microbial genomes. Genome Res. 2000 Jun;10(6):744-57.
Blanchette M, Tompa M. Discovery of regulatory elements by a computational method for phylogenetic
footprinting. Genome Res. 2002 May;12(5):739-48.
GuhaThakurta D, Palomar L, Stormo GD, Tedesco P, Johnson TE, Walker DW, Lithgow G, Kim S, Link
CD. Identification of a novel cis-regulatory element involved in the heat shock response in Caenorhabditis
elegans using microarray gene expression and computational methods. Genome Res. 2002 May;12(5):70112. Erratum in: Genome Res 2002 Aug;12(8):1301.
Guo H, Moose SP. Conserved noncoding sequences among cultivated cereal genomes identify candidate
regulatory sequence elements and patterns of promoter evolution. Plant Cell. 2003 May;15(5):1143-58.
Schwartz S, Elnitski L, Li M, Weirauch M, Riemer C, Smit A, Green ED, Hardison RC, Miller W; NISC
Comparative Sequencing Program. MultiPipMaker and supporting tools: Alignments and analysis of
multiple genomic DNA sequences. Nucleic Acids Res. 2003 Jul 1;31(13):3518-24.
Cliften P, Sudarsanam P, Desikan A, Fulton L, Fulton B, Majors J, Waterston R, Cohen BA, Johnston M.
Finding functional features in Saccharomyces genomes by phylogenetic footprinting. Science. 2003 Jul
4;301(5629):71-6. Epub 2003 May 29.
Lenhard B, Sandelin A, Mendoza L, Engstrom P, Jareborg N, Wasserman WW. Identification of
conserved regulatory elements by comparative genome analysis. J Biol. 2003;2(2):13. Epub 2003 May 22.
Shah N, Couronne O, Pennacchio LA, Brudno M, Batzoglou S, Bethel EW, Rubin EM, Hamann B,
Dubchak I. Phylo-VISTA: interactive visualization of multiple DNA sequence alignments. Bioinformatics.
2004 Mar 22;20(5):636-43.
Corcoran DL, Feingold E, Benos PV. FOOTER: a web tool for finding mammalian DNA regulatory
regions using phylogenetic footprinting. Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W442-6.
Berezikov E, Guryev V, Cuppen E. CONREAL web server: identification and visualization of conserved
transcription factor binding sites. Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W447-50.
La D, Livesay DR. MINER: software for phylogenetic motif identification. Nucleic Acids Res. 2005 Jul
1;33(Web Server issue):W267-70.
Florea L, McClelland M, Riemer C, Schwartz S, Miller W. EnteriX 2003: Visualization tools for genome
alignments of Enterobacteriaceae. Nucleic Acids Res. 2003 Jul 1;31(13):3527-32.
Solovyev VV, Shahmuradov IA. PromH: Promoters identification using orthologous genomic sequences.
Nucleic Acids Res. 2003 Jul 1;31(13):3540-5.
Kellis M, Patterson N, Endrizzi M, Birren B, Lander ES. Sequencing and comparison of yeast species to
identify genes and regulatory elements. Nature. 2003 May 15;423(6937):241-54.
Doniger SW, Huh J, Fay JC. Identification of functional transcription factor binding sites using closely
related Saccharomyces species. Genome Res. 2005 May;15(5):701-9.
Dermitzakis ET, Clark AG. Evolution of transcription factor binding sites in Mammalian gene regulatory
regions: conservation and turnover. Mol Biol Evol. 2002 Jul;19(7):1114-21.
Manson McGuire A, Church GM. Predicting regulons and their cis-regulatory motifs by comparative
genomics. Nucleic Acids Res. 2000 Nov 15;28(22):4523-30.
Boffelli D, McAuliffe J, Ovcharenko D, Lewis KD, Ovcharenko I, Pachter L, Rubin EM. Phylogenetic
shadowing of primate sequences to find functional regions of the human genome. Science. 2003 Feb
28;299(5611):1391-4.



McCue L, Thompson W, Carmack C, Ryan MP, Liu JS, Derbyshire V, Lawrence CE. Phylogenetic
footprinting of transcription factor binding sites in proteobacterial genomes. Nucleic Acids Res. 2001 Feb
1;29(3):774-82.
Stojanovic N, Florea L, Riemer C, Gumucio D, Slightom J, Goodman M, Miller W, Hardison R.
Comparison of five methods for finding conserved sequences in multiple alignments of gene regulatory
regions. Nucleic Acids Res. 1999 Oct 1;27(19):3899-910.
Liu J, Tan K, Stormo GD. Computational identification of the Spo0A-phosphate regulon that is essential
for the cellular differentiation and development in Gram-positive spore-forming bacteria. Nucleic Acids
Res. 2003 Dec 1;31(23):6891-903.
 Базы данных ортологов









Dufayard JF, Duret L, Penel S, Gouy M, Rechenmann F, Perriere G. Tree pattern matching in
phylogenetic trees: automatic search for orthologs or paralogs in homologous gene sequence databases.
Bioinformatics. 2005 Jun 1;21(11):2596-603.
Cannon SB, Young ND. OrthoParaMap: distinguishing orthologs from paralogs by integrating
comparative genome data and gene phylogenies. BMC Bioinformatics. 2003 Sep 2;4:35.
Nucleic Acids Res. 2000 Jan 1;28(1):33-6. The COG database: a tool for genome-scale analysis of protein
functions and evolution. Tatusov RL, Galperin MY, Natale DA, Koonin EV.
Genome Res. 2003 Sep;13(9):2178-89. OrthoMCL: identification of ortholog groups for eukaryotic
genomes. Li L, Stoeckert CJ Jr, Roos DS.
Hum Mutat. 2004 Aug;24(2):112-9. OrthoDisease: a database of human disease orthologs. O'Brien KP,
Westerlund I, Sonnhammer EL.
J Mol Biol. 2001 Dec 14;314(5):1041-52. Automatic clustering of orthologs and in-paralogs from pairwise
species comparisons. Remm M, Storm CE, Sonnhammer EL.
Yuan YP, Eulenstein O, Vingron M, Bork P. Towards detection of orthologues in sequence databases.
Bioinformatics. 1998;14(3):285-9.
Leveugle M, Prat K, Perrier N, Birnbaum D, Coulier F. ParaDB: a tool for paralogy mapping in vertebrate
genomes. Nucleic Acids Res. 2003 Jan 1;31(1):63-7.
HOMOLOGENE (NCBI)
 Базы данных метаболических путей





Pathway Hunter Tool: hhtp://www.pht.uni-koeln.de
Genome Res. 1998 Mar;8(3):203-10. Reconstruction of amino acid biosynthesis pathways from the
complete genome sequence. Bono H, Ogata H, Goto S, Kanehisa M.
IEEE Trans Nanobioscience. 2004 Sep;3(3):192-9. Web-based information retrieval system for the
prediction of metabolic pathways. Chen M, Hofestadt R.
KEGG
MetaCyc и его частные инкарнации (EcoCyc etc.)
 Gene and protein fusions








Nucleic Acids Res. 2004 Jan 1;32(Database issue):D273-6. FusionDB: a database for in-depth analysis of
prokaryotic gene fusion events. Suhre K, Claverie JM.
Genome Biol. 2002;3(5):research0024. Epub 2002 Apr 26. Evolution of gene fusions: horizontal transfer
versus independent events. Yanai I, Wolf YI, Koonin EV.
Gene. 2003 Dec 24;323:181-7. Erratum in: Gene. 2004 Jun 14;330:185. On dynamics of overlapping
genes in bacterial genomes. Fukuda Y, Nakayama Y, Tomita M.
Genome Res. 2004 Nov;14(11):2268-72. Properties of overlapping genes are conserved across microbial
genomes. Johnson ZI, Chisholm SW.
Genome Res. 2000 Aug;10(8):1204-10. Predicting protein function by genomic context: quantitative
evaluation and qualitative inferences. Huynen M, Snel B, Lathe W 3rd, Bork P.
Proc Natl Acad Sci U S A. 2001 Jul 3;98(14):7940-5. Genes linked by fusion events are generally of the
same functional category: a systematic analysis of 30 microbial genomes. Yanai I, Derti A, DeLisi C.
Genome Biol. 2000;1(6):RESEARCH0013. Epub 2000 Dec 4. Interkingdom gene fusions. Wolf YI,
Kondrashov AS, Koonin EV.
Nature. 1999 Nov 4;402(6757):86-90. Comment in: Nature. 1999 Nov 4;402(6757):23, 25-6. Protein
interaction maps for complete genomes based on gene fusion events. Enright AJ, Iliopoulos I, Kyrpides
NC, Ouzounis CA.
 Предсказание митохондриальной локализации и базы данных
митохондриальных белков













Cameron JM, Hurd T, Robinson BH. Computational identification of human mitochondrial proteins based
on homology to yeast mitochondrially targeted proteins. Bioinformatics. 2005 May 1;21(9):1825-30.
Marcotte EM, Xenarios I, van Der Bliek AM, Eisenberg D. Localizing proteins in the cell from their
phylogenetic profiles. Proc Natl Acad Sci U S A. 2000 Oct 24;97(22):12115-20.
PSORTII
TargetP
Mitoprot
Mitopred
MITOMAP
MITOP
Chou KC, Cai YD. Predicting protein localization in budding yeast. Bioinformatics. 2005 Apr
1;21(7):944-50.
Huang Y, Li Y. Prediction of protein subcellular locations using fuzzy k-NN method. Bioinformatics.
2004 Jan 1;20(1):21-8.
Plasterer TN, Smith TF, Mohr SC. Survey of human mitochondrial diseases using new genomic/proteomic
tools. Genome Biol. 2001;2(6):RESEARCH0021. Epub 2001 Jun 1.
Andreoli C, Prokisch H, Hortnagel K, Mueller JC, Munsterkotter M, Scharfe C, Meitinger T. MitoP2, an
integrated database on mitochondrial proteins in yeast and man. Nucleic Acids Res. 2004 Jan
1;32(Database issue):D459-62.
Attimonelli M, Catalano D, Gissi C, Grillo G, Licciulli F, Liuni S, Santamaria M, Pesole G, Saccone C.
MitoNuc: a database of nuclear genes coding for mitochondrial proteins. Update 2002. Nucleic Acids Res.
2002 Jan 1;30(1):172-3.
 Метагеномы






Tringe SG, von Mering C, Kobayashi A, Salamov AA, Chen K, Chang HW, Podar M, Short JM, Mathur
EJ, Detter JC, Bork P, Hugenholtz P, Rubin EM. Comparative metagenomics of microbial communities.
Science. 2005 Apr 22;308(5721):554-7.
Ram RJ, Verberkmoes NC, Thelen MP, Tyson GW, Baker BJ, Blake RC 2nd, Shah M, Hettich RL,
Banfield JF. Community proteomics of a natural microbial biofilm. Science. 2005 Jun 24;308(5730):191520.
Eckburg PB, Bik EM, Bernstein CN, Purdom E, Dethlefsen L, Sargent M, Gill SR, Nelson KE, Relman
DA. Diversity of the human intestinal microbial flora. Science. 2005 Jun 10;308(5728):1635-8.
Backhed F, Ley RE, Sonnenburg JL, Peterson DA, Gordon JI. Host-bacterial mutualism in the human
intestine. Science. 2005 Mar 25;307(5717):1915-20. Review.
Science. 2004 Apr 2;304(5667):66-74. Epub 2004 Mar 4. Comment in: Science. 2004 Apr ;304(5667):5860. Environmental genome shotgun sequencing of the Sargasso Sea. Venter JC, et al.
Tyson GW, Chapman J, Hugenholtz P, Allen EE, Ram RJ, Richardson PM, Solovyev VV, Rubin EM,
Rokhsar DS, Banfield JF. Community structure and metabolism through reconstruction of microbial
genomes from the environment. Nature. 2004 Mar 4;428(6978):37-43.
 Распределение числа укладок в структурных семействах, укладок,
доменов и генов в геномах







Huynen MA, van Nimwegen E. The frequency distribution of gene family sizes in complete genomes. Mol
Biol Evol. 1998 May;15(5):583-9.
Yanai I, Camacho CJ, DeLisi C. Predictions of gene family distributions in microbial genomes: evolution
by gene duplication and modification. Phys Rev Lett. 2000 Sep 18;85(12):2641-4.
Koonin EV, Wolf YI, Karev GP. The structure of the protein universe and genome evolution. Nature. 2002
Nov 14;420(6912):218-23.
Unger R, Uliel S, Havlin S. Scaling law in sizes of protein sequence families: from super-families to
orphan genes. Proteins. 2003 Jun 1;51(4):569-76.
Genome Biol. 2004;5(4):R27. Epub 2004 Mar 18. Comparative genomics of gene-family size in closely
related bacteria. Pushker R, Mira A, Rodriguez-Valera F.
J Mol Biol. 2000 Jun 16;299(4):897-905. Estimating the number of protein folds and families from
complete genome data. Wolf YI, Grishin NV, Koonin EV.
Phys Rev Lett. 2000 Sep 18;85(12):2641-4. Predictions of gene family distributions in microbial genomes:
evolution by gene duplication and modification. Yanai I, Camacho CJ, DeLisi C.







Perez-Rueda E, Collado-Vides J, Segovia L. Phylogenetic distribution of DNA-binding transcription
factors in bacteria and archaea. Comput Biol Chem. 2004 Dec;28(5-6):341-50.
Wolf YI, Grishin NV, Koonin EV. Estimating the number of protein folds and families from complete
genome data. J Mol Biol. 2000 Jun 16;299(4):897-905.
Abeln S, Deane CM. Fold usage on genomes and protein fold evolution. Proteins. 2005 Jul 6; [Epub ahead
of print]
Frishman D, Mewes HW. Protein structural classes in five complete genomes. Nat Struct Biol. 1997
Aug;4(8):626-8.
Zhang C, DeLisi C. Estimating the number of protein folds. J Mol Biol. 1998 Dec 18;284(5):1301-5.
Luscombe NM, Qian J, Zhang Z, Johnson T, Gerstein M. The dominance of the population by a selected
few: power-law behaviour applies to a wide variety of genomic properties. Genome Biol. 2002 Jul
25;3(8):RESEARCH0040.
Ranea JA, Buchan DW, Thornton JM, Orengo CA. Evolution of protein superfamilies and bacterial
genome size. J Mol Biol. 2004 Feb 27;336(4):871-87.
 Селеноцистеин, белки, его содержащие, сигналы вставки SeCys в
белки (SECIS)













Biofactors. 2001;14(1-4):61-8. An extended Escherichia coli "selenocysteine insertion sequence" (SECIS)
as a multifunctional RNA structure. Engelberg-Kulka H, Liu Z, Li C, Reches M.
Taskov K, Chapple C, Kryukov GV, Castellano S, Lobanov AV, Korotkov KV, Guigo R, Gladyshev VN.
Nematode selenoproteome: the use of the selenocysteine insertion system to decode one codon in an animal
genome? Nucleic Acids Res. 2005 Apr 20;33(7):2227-38. Print 2005.
Zhang Y, Fomenko DE, Gladyshev VN. The microbial selenoproteome of the Sargasso Sea. Genome Biol.
2005;6(4):R37.
Zhang Y, Gladyshev VN. An algorithm for identification of bacterial selenocysteine insertion sequence
elements and selenoprotein genes. Bioinformatics. 2005 Jun 1;21(11):2580-9.
Shrimali RK, Lobanov AV, Xu XM, Rao M, Carlson BA, Mahadeo DC, Parent CA, Gladyshev VN,
Hatfield DL. Selenocysteine tRNA identification in the model organisms Dictyostelium discoideum and
Tetrahymena thermophila. Biochem Biophys Res Commun. 2005 Apr 1;329(1):147-51.
Kryukov GV, Gladyshev VN. The prokaryotic selenoproteome. EMBO Rep. 2004 May;5(5):538-43.
Kryukov GV, Castellano S, Novoselov SV, Lobanov AV, Zehtab O, Guigo R, Gladyshev VN.
Characterization of mammalian selenoproteomes. Science. 2003 May 30;300(5624):1439-43.
Kryukov GV, Gladyshev VN. Mammalian selenoprotein gene signature: identification and functional
analysis of selenoprotein genes using bioinformatics methods. Methods Enzymol. 2002;347:84-100.
Busch A, Will S, Backofen R. SECISDesign: a server to design SECIS-elements within the coding
sequence. Bioinformatics. 2005 Aug 1;21(15):3312-3.
Mourier T, Pain A, Barrell B, Griffiths-Jones S. A selenocysteine tRNA and SECIS element in
Plasmodium falciparum. RNA. 2005 Feb;11(2):119-22.
Lambert A, Lescure A, Gautheret D. A survey of metazoan selenocysteine insertion sequences. Biochimie.
2002 Sep;84(9):953-9.
Castellano S, Morozova N, Morey M, Berry MJ, Serras F, Corominas M, Guigo R. In silico identification
of novel selenoproteins in the Drosophila melanogaster genome. EMBO Rep. 2001 Aug;2(8):697-702.
Lescure A, Gautheret D, Carbon P, Krol A. Novel selenoproteins identified in silico and in vivo by using a
conserved RNA structural motif. J Biol Chem. 1999 Dec 31;274(53):38147-54.
 Эволюция экзон-интронной структуры. Есть ли корреляция с
доменной структурой белков?






Nielsen CB, Friedman B, Birren B, Burge CB, Galagan JE. Patterns of intron gain and loss in fungi. PLoS
Biol. 2004 Dec;2(12):e422.
Parra M, Gee S, Chan N, Ryaboy D, Dubchak I, Mohandas N, Gascard PD, Conboy JG. Differential
domain evolution and complex RNA processing in a family of paralogous EPB41 (protein 4.1) genes
facilitate expression of diverse tissue-specific isoforms. Genomics. 2004 Oct;84(4):637-46.
Roy SW, Gilbert W. The pattern of intron loss. Proc Natl Acad Sci U S A. 2005 Jan 18;102(3):713-8.
Sverdlov AV, Babenko VN, Rogozin IB, Koonin EV. Preferential loss and gain of introns in 3' portions of
genes suggests a reverse-transcription mechanism of intron insertion. Gene. 2004 Aug 18;338(1):85-91.
Niu DK, Hou WR, Li SW. mRNA-mediated intron losses: evidence from extraordinarily large exons. Mol
Biol Evol. 2005 Jun;22(6):1475-81.
Feiber AL, Rangarajan J, Vaughn JC. The evolution of single-copy Drosophila nuclear 4f-rnp genes:
spliceosomal intron losses create polymorphic alleles. J Mol Evol. 2002 Oct;55(4):401-13.



Robertson HM. The large srh family of chemoreceptor genes in Caenorhabditis nematodes reveals
processes of genome evolution involving large duplications and deletions and intron gains and losses.
Genome Res. 2000 Feb;10(2):192-203.
Bon E, Casaregola S, Blandin G, Llorente B, Neuveglise C, Munsterkotter M, Guldener U, Mewes HW,
Van Helden J, Dujon B, Gaillardin C. Molecular evolution of eukaryotic genomes: hemiascomycetous
yeast spliceosomal introns. Nucleic Acids Res. 2003 Feb 15;31(4):1121-35.
Long M, Rosenberg C. Testing the "proto-splice sites" model of intron origin: evidence from analysis of
intron phase correlations. Mol Biol Evol. 2000 Dec;17(12):1789-96.
 Эволюция альтернативного сплайсинга. Функциональны ли
неконсервативные альтернативные экзоны?







Resch A, Xing Y, Alekseyenko A, Modrek B, Lee C. Evidence for a subpopulation of conserved
alternative splicing events under selection pressure for protein reading frame preservation. Nucleic Acids
Res. 2004 Feb 24;32(4):1261-9.
Sugnet CW, Kent WJ, Ares M Jr, Haussler D. Transcriptome and genome conservation of alternative
splicing events in humans and mice. Pac Symp Biocomput. 2004;:66-77.
Pan Q, Bakowski MA, Morris Q, Zhang W, Frey BJ, Hughes TR, Blencowe BJ. Alternative splicing of
conserved exons is frequently species-specific in human and mouse. Trends Genet. 2005 Feb;21(2):73-7.
Nurtdinov RN, Artamonova II, Mironov AA, Gelfand MS. Low conservation of alternative splicing
patterns in the human and mouse genomes. Hum Mol Genet. 2003 Jun 1;12(11):1313-20.
Sorek R, Shamir R, Ast G. How prevalent is functional alternative splicing in the human genome? Trends
Genet. 2004 Feb;20(2):68-71.
Thanaraj TA, Clark F, Muilu J. Conservation of human alternative splice events in mouse. Nucleic Acids
Res. 2003 May 15;31(10):2544-52.
Parra M, Gee S, Chan N, Ryaboy D, Dubchak I, Mohandas N, Gascard PD, Conboy JG. Differential
domain evolution and complex RNA processing in a family of paralogous EPB41 (protein 4.1) genes
facilitate expression of diverse tissue-specific isoforms. Genomics. 2004 Oct;84(4):637-46.
 Идентификация альтернативных экзонов




Yeo GW, Van Nostrand E, Holste D, Poggio T, Burge CB. Identification and analysis of alternative
splicing events conserved in human and mouse. Proc Natl Acad Sci U S A. 2005 Feb 22;102(8):2850-5.
Sorek R, Shemesh R, Cohen Y, Basechess O, Ast G, Shamir R. A non-EST-based method for exonskipping prediction. Genome Res. 2004 Aug;14(8):1617-23.
Kan Z, Castle J, Johnson JM, Tsinoremas NF. Detection of novel splice forms in human and mouse using
cross-species approach. Pac Symp Biocomput. 2004;:42-53.
Philipps DL, Park JW, Graveley BR. A computational and experimental approach toward a priori
identification of alternatively spliced exons. RNA. 2004 Dec;10(12):1838-44.
Download