Evolutionary fate of CpG and CNG clusters in mammals

Comparative analysis of CpG and CNG clusters in the vicinity of
mammalian orthologous genes.
Oparina Nina (1,2,*), Fridman Marina (2), Makeev Vsevolod (2)
1 Engelhardt Institute of Molecular Biology, RAS, Moscow, Russia
2 Institute of Genetics and Selection of Industrial Microorganisms, GosNIIgenetika, Moscow, Russia
* Corresponding author, e-mail: oparina@gmail.com
Motivation and Aim: Mammalian genomes are depleted of CpG dinucleotides because methylated
cytosines in the CpG context are prone to deamination. Nevertheless, multiple clusters of CpGs,
traditionally called CpG islands, are frequent in these genomes. Recent data demonstrated that
CpGs remain highly mutable even within CpG islands, although they are considered to be
unmethylated there, and that in several phyla CpG islands are depleted and shortened. Even so,
these genomes still contain many CpG islands. Multiple algorithms have been used to detect CpG
islands in the genome, but there is still no good approach for detecting functional CpG islands,
and their exact functions remain unknown.
Methods and Algorithms: We used the UCSC Genome Browser and Galaxy, the RepeatMasker Web
Server, Bioconductor and R for clustering tasks, CpGcluster, CpGProD, the Evola and RoundUp
ortholog databases, and several other tools.
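For readers unfamiliar with how CpG islands are usually scored, the following is a minimal Python sketch of the two quantities most detection methods rely on: GC fraction and the observed/expected CpG ratio. It is not the procedure used in this work (CpGcluster, for instance, is distance-based). The function names and the default thresholds (length of at least 200 bp, GC fraction of at least 0.5, observed/expected ratio of at least 0.6, the classical Gardiner-Garden and Frommer criteria) are assumptions chosen for illustration.

```python
def cpg_stats(seq):
    """Return (GC fraction, CpG observed/expected ratio) for a DNA string."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0.0, 0.0
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")              # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    expected = c * g / n             # expected CpG count if C and G were independent
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_fraction, obs_exp


def looks_like_island(seq, min_len=200, min_gc=0.5, min_oe=0.6):
    """Hypothetical helper: apply illustrative length/GC/O-E thresholds."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_oe
```

Small or GC-poor CpG clusters of the kind discussed in this abstract would fail such thresholds, which is one reason threshold-based detectors can miss them.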
Results: Despite the overall mutability of CpGs, we found numerous CpG-enriched clusters in the
vicinity of orthologous mammalian genes, including their promoter regions, 3' ends, and UTRs.
In several cases we detected orthologous introns containing both CpG clusters and
mammalian-conserved elements. Nearly half of such sequences were highly conserved in mammals;
the others showed no sequence similarity but were surrounded by conserved elements. We
classified CpG clusters into three groups: non-conserved; located at the same orthologous loci
but without sequence conservation of the CpG island itself; and highly conserved. Conserved CpG
islands were detected mostly in gene-overlapping loci, although a subset of intergenic conserved
CpG islands was also described. Surprisingly, the most conserved CpG islands were enriched in
CNG motifs. We compared CNG content across the various CpG clusters and found that CNG motifs
are also enriched in germline-methylated islands.
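As an illustration of what comparing CNG content between clusters can involve, here is a minimal Python sketch that counts CNG trinucleotides (C, any base, G) in a sequence and reports their density. It is an assumption about how such a count might be done, not the authors' pipeline; the function name `cng_profile` and the choice to normalise by the number of trinucleotide positions are hypothetical.

```python
import re
from collections import Counter

# Lookahead pattern so that overlapping motifs are all counted,
# e.g. both CCG and CGG in "CCGG".
_CNG = re.compile(r"(?=C([ACGT])G)")


def cng_profile(seq):
    """Return (total CNG count, CNG density, counts keyed by the middle base)."""
    s = seq.upper()
    by_middle = Counter(m.group(1) for m in _CNG.finditer(s))
    total = sum(by_middle.values())
    positions = max(len(s) - 2, 0)   # number of trinucleotide start positions
    density = total / positions if positions else 0.0
    return total, density, dict(by_middle)
```

Densities computed this way for conserved and non-conserved clusters could then be compared with any standard two-sample test; the abstract does not state which statistic was used.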
Conclusion: CpG-enriched sequences, both those detected by modern CpG-island detection methods
and small or GC-poor ones that escape detection, are frequently conserved and/or located in the
vicinity of the same orthologs in mammals. We have constructed a classification of these
sequences that can be used for further functional study of CpG islands. We have shown that
conserved CpG clusters are frequently methylated in the germline and are also enriched in CNG
motifs. Understanding the evolutionary stability of such islands requires further investigation.