Structure, organization and nucleotide diversity of the mitochondrial control region and cytochrome b of southern water vole (Arvicola sapidus)
ALEJANDRO CENTENO-CUADROS¹ & JOSÉ A. GODOY²
¹Department of Conservation Biology, and ²Department of Integrative Ecology, Estación Biológica de Doñana, CSIC. C/ Américo Vespucio s/n. Isla de La Cartuja, 41092 Sevilla, Spain
(Received 27 October 2009; revised 5 February 2010; accepted 5 February 2010)
Abstract
The southern water vole (Arvicola sapidus Miller, 1908) is an endangered rodent whose conservation guidelines should preserve the current genetic variability. We analyze the structure and organization of the mitochondrial control region (CR) in A. sapidus. The CR of this species is characterized by a low guanine-cytosine content, the absence of any repetitive motif within the two hypervariable regions, and the presence of the two extended termination-associated sequences and conserved sequence blocks. Nucleotide diversity comparisons between A. sapidus and the European water vole (Arvicola terrestris) revealed differences in the distribution of genetic variation. Furthermore, we provide primers for the amplification of short and highly polymorphic fragments of CR and cytochrome b especially designed for degraded materials. These markers offer molecular tools to assist in the establishment of future conservation and management guidelines, and will also facilitate studies of this species at different spatial and evolutionary scales.
Keywords: Conserved sequence blocks, cytochrome b, D-loop, extended termination-associated sequences, Muridae, Rodentia
Introduction
The vertebrate mitochondrial DNA (mtDNA) is a closed, circular, and maternally inherited molecule widely used in molecular ecology and phylogeographic studies (Ballard and Whitlock 2004). Within the mtDNA, which encodes 37 genes, the non-coding control region (CR) and the cytochrome b gene (Cytb) have been the markers most frequently used at intraspecific and interspecific levels.
The CR is a non-coding mitochondrial fragment, the length of which varies depending on the number of repeated motifs within the region. The CR of mammals is flanked by tRNAPro and tRNAPhe and is divided into two hypervariable domains separated by a central conserved region; but see, e.g., Matson and Baker (2001) for two hypervariable regions within the second domain in Clethrionomys. The latter is characterized by three conserved sequence blocks (CSBs) involved in regulatory signals for the processing of the RNAs that prime replication. The two hypervariable regions are mainly composed of extended termination-associated sequences (ETAS), the most rapidly evolving portion of the mitochondrial genome (Saccone et al. 1993), which are presumably related to the regulation of replication and transcription of mtDNA. Hypervariable fragments within the CR usually meet the polymorphism required to address issues related to population genetic structure or phylogeography. However, the heterogeneous distribution of sequence variation along the region, and the possible occurrence of homopolymer tracts, tandem repeats, and heteroplasmy, may hamper the direct sequencing of polymerase chain reaction (PCR) products. It is thus necessary to characterize the structure of this region thoroughly in every newly targeted taxon before its routine use in molecular ecology and evolutionary studies.
Correspondence: A. Centeno-Cuadros, Estación Biológica de Doñana, CSIC, C/Américo Vespucio, s/n C.P., Sevilla 41092, Spain. Tel: 34 954466700. Fax: 34 954621125. E-mail: acenteno@ebd.csic.es
The two species of the genus Arvicola (Mammalia, Rodentia, Cricetidae) are distributed across Eurasia. The European water vole (Arvicola terrestris) ranges from eastern Asia to the northern regions of Iberia and has been the subject of several molecular ecology studies (Stewart et al. 1998, 1999; Berthier et al. 2005; Piertney et al. 2005; Oliver and Piertney 2006).
By contrast, the southern water vole (Arvicola sapidus) is distributed through Iberia and France and is currently categorized as Vulnerable by the IUCN (Rigaux et al. 2008) because of habitat fragmentation, contamination of water bodies, and the introduction of the American mink (Mustela vison). Molecular studies of this species are scarce and have mainly focused on its karyotype (Díaz de la Guardia and Pretel 1978, 1979; Megías-Nogales et al. 2003); only recently has it been studied at the intraspecific (phylogeography) and interspecific (evolutionary history of Arvicola) levels (Centeno-Cuadros et al. 2009a,b). Conservation biologists and wildlife managers need molecular tools to gain in-depth knowledge of historical and contemporary processes and thereby develop proper conservation programs for the species. To date, there are no detailed studies on CR structure and variability in water voles, partially because of homopolymer sequences that hamper the sequencing of the entire region (Piertney et al. 2005; S. Piertney, personal communication). In the present study, we characterize the organization and variability of the entire mitochondrial CR sequence in A. sapidus by designing a set of primers that circumvent sequencing problems. In a second step, we provide primers that amplify short and highly polymorphic fragments of CR and Cytb to study intraspecific variation from degraded materials, and we apply these primers to non-invasive samples of the southern and European water voles.
Materials and methods
We took ear punches from 47 individuals of A. sapidus trapped (and released alive) in the Natural Region of Doñana (southwestern Spain, 37°10′ N, 6°23′ W), which were used to describe nucleotide variation in the CR. We also obtained fresh tissue samples from distant populations in Spain (GenBank accession numbers FJ895499, FJ895500, FJ895507, and FJ895577; see Table I) in order to obtain a representative sampling of haplotypes and nucleotide polymorphism.
We also used 22 non-invasive samples (see below) of A. terrestris scherman from the north of Spain in order to test the cross-amplification of the CR and Cytb primers designed in this study.
Fresh tissue DNA was extracted with a “salting-out” protocol (Müllenbach et al. 1989). We also used DNA extracted from non-invasive samples (mainly bones and skins obtained from pellets of barn owl, Tyto alba, and eagle owl, Bubo bubo) and from museum specimens (housed at the scientific collections of the Biological Station of Doñana) to test the suggested primers targeting short and polymorphic fragments of CR and Cytb (see Results and discussion).
We found homopolymer tracts within domain II that hampered sequencing reactions in the CR of A. sapidus. We solved this problem by using primers F15708 and R92 (Piertney et al. 2005), primer 5′-TCCCCACCATCAGCACCCAAAGC-3′ designed by Stacy et al. (1997) (hereafter F15374), and four specifically designed internal primers yielding partially overlapping fragments (F15816, 5′-ATGTTTTATCGTCCATACGTTCC-3′; F15872, 5′-AATCAGCCCATGCCTAACAT-3′; R15946, 5′-TAGCCGTCAAGGCATGAAG-3′; RCRasa, 5′-AAAAACAACTCAAAATTCCAAAA-3′). Primer3 (Rozen and Skaletsky 2000) was used to optimize the locations of these primers with the default primer design parameters, targeting sequences conserved across all haplotypes and restricting the overall size of the amplified fragment to less than 250 bp. Cytb was amplified and sequenced using primers H15288 and L14115 (Martin et al. 2000). PCR amplifications were performed as follows: 94°C for 5 min; 40 cycles of 92°C for 30 s, 62°C (CR) or 60°C (Cytb) for 30 s, and 72°C for 30 s; and a final extension at 72°C for 5 min. Five microliters of mitochondrial PCR products were purified with 2 µl of ExoSAP-IT enzyme (USB Corp., Cleveland, OH, USA). Sequencing reactions were performed using the BigDye Terminator Cycle Sequencing Kit v.1.1 (Applied Biosystems, Inc.,
Foster City, CA, USA) following the manufacturer’s instructions, with the same primers used for the amplifications. Sequencing products were run on an Applied Biosystems 3130xl Genetic Analyzer. Forward and reverse sequences were edited and aligned using Sequencher 4.6 (Gene Codes Corp., Ann Arbor, MI, USA). We described the distribution of nucleotide diversity along the CR using a sliding window of 100 sites (step size of 25 sites) as implemented in the software DnaSP 4.50.3 (Rozas et al. 2003).
Results and discussion
Herein we report the first analysis of the structure of the entire CR of a representative of the genus Arvicola (A. sapidus) (Figure 1; GenBank accession number FJ502319). By using four partially overlapping fragments, we were able to obtain complete sequences from 47 southern water voles. We identified a central region that includes two CSBs. While CSB3 was completely conserved in A. sapidus, CSB2 differed by two indels with respect to the corresponding sequence in the Mus musculus mitochondrial genome. We could not locate CSB1 in Arvicola, indicating large divergence between M. musculus and A. sapidus sequences. However, the degree of conservation and functionality of this sequence block has been controversial. For example, whereas Sbisa et al. (1997) reported the CSB1 as the least conserved sequence block and suggested it was the most important in terms of functionality, studies in the red-backed vole Clethrionomys (Matson and Baker 2001) showed CSB1 as a sequence block with 40% of variability. The authors eventually suggested that only a single element of CSBs and ETAS was involved in mitochondrial replication and, consequently, that variation within any additional element should not have any detrimental effect. Base composition and distances between CSBs might be variable among species but are usually conserved at the intraspecific level (Sbisa et al. 1997).

Table I. Geographic distribution, type of sample, and GenBank accession numbers for all CR (204 bp) and Cytb (208 bp) haplotypes sequenced for A. sapidus from Spain (SP), Portugal (POR), and France (FR).

CR
GenBank accession number   Sample size   Source         Locality                   Latitude   Longitude
FJ895495                   1             Fresh tissue   Huelva (SP)                36.9876    -6.4814
FJ895496                   2             Fresh tissue   Huelva (SP)                36.9876    -6.4814
FJ895497                   1             Bones          Évora (POR)                38.7047    -7.4000
FJ895499                   1             Fresh tissue   Cádiz (SP)                 36.0616    -5.6363
FJ895500                   1             Fresh tissue   Sevilla (SP)               37.9337    -5.7621
FJ895504                   1             Bones          Badajoz (SP)               38.8666    -6.4274
FJ895507                   1             Fresh tissue   Ávila (SP)                 40.4271    -5.3032
FJ895519                   3             Fixed tissue   Gerona (SP)                41.8490    2.3902
                                         Bones          Navarra (SP)               42.7539    -1.0998
                                         Bones          Tarragona (SP)             40.8541    -1.096
FJ895521                   1             Bones          Navarra (SP)               42.7539    -1.0998
FJ895522                   1             Bones          Évora (POR)                38.7047    -7.4000
FJ895530                   1             Bones          Sevilla (SP)               37.9337    -5.7621
FJ895543                   2             Bones          Haute-Vienne (FR)          45.8053    0.9337
                                         Bones          Santander (SP)             42.9892    -3.9710
FJ895544                   1             Bones          Haute-Vienne (FR)          45.8053    0.9337
FJ895545                   1             Bones          Haute-Vienne (FR)          45.8053    0.9337
FJ895547                   1             Bones          Creuse (FR)                45.8736    1.6440
FJ895552                   1             Bones          Jaén (SP)                  37.9108    -3.0024
FJ895555                   1             Fixed tissue   Burgos (SP)                42.1729    -3.7077
FJ895559                   1             Bones          La Rioja (SP)              42.3269    -3.0377
FJ895563                   1             Bones          Pyrénées-Orientales (FR)   42.8332    2.9191
FJ895565                   1             Fresh tissue   Setúbal (POR)              37.9383    -8.7766
FJ895566                   1             Fresh tissue   Setúbal (POR)              37.9383    -8.7766
FJ895568                   1             Fresh tissue   Auvergne (FR)              45.7344    2.6273
FJ895569                   1             Fresh tissue   Auvergne (FR)              45.7344    2.6273
FJ895576                   1             Fixed tissue   Granada (SP)               36.9728    -3.4531
FJ895577                   1             Fresh tissue   Burgos (SP)                42.1729    -3.7077

Cytb (GenBank accession numbers FJ895410, FJ895416, FJ895459, FJ895467, FJ895474, FJ895478, FJ895488)
Sample size   Source         Locality                   Latitude   Longitude
2             Fresh tissue   Auvergne (FR)              45.7344    2.6273
3             Fresh tissue   Huelva (SP)                36.9876    -6.4814
2             Fresh tissue   Setúbal (POR)              37.9383    -8.7766
1             Fresh tissue   Sevilla (SP)               37.9337    -5.7621
1             Bones          Creuse (FR)                45.8736    1.6440
1             Bones          Haute-Vienne (FR)          45.8053    0.9337
1             Bones          Granada (SP)               36.9728    -3.4531
2             Bones          Navarra (SP)               42.7539    -1.0998
2             Bones          Évora (POR)                38.7047    -7.4000
1             Bones          Santander (SP)             42.9892    -3.9710
1             Bones          Sevilla (SP)               37.9337    -5.7621
1             Fixed tissue   Burgos (SP)                42.1729    -3.7077
1             Fixed tissue   Gerona (SP)                41.8490    2.3902
1             Bones          Badajoz (SP)               38.8666    -6.4274
2             Bones          Haute-Vienne (FR)          45.8053    0.9337
1             Bones          Jaén (SP)                  37.9108    -3.0024
1             Bones          La Rioja (SP)              42.3269    -3.0377
1             Bones          Pyrénées-Orientales (FR)   42.8332    2.9191
1             Bones          Tarragona (SP)             40.8541    -1.0960

The global nucleotide composition in Arvicola is AT-rich and follows the common biased base content found in other organisms (Saccone et al. 1987; Zhang and Hewitt 1997); the average percentages in A. sapidus were A (32.4%), T (30.7%), C (23.8%), and G (13.1%). However, A. sapidus showed slight deviations from the expected A > T > C > G pattern of mammals in ETAS and CSBs. ETAS: T (32.3%), A (31.5%), C (25%), and G (11.3%); CSB: C (33.3%), A (29.3%), T (25.8%), and G (11.6%).
Figure 1. Structure and organization of the complete Cytb and mitochondrial CR of A. sapidus. Forward and reverse primers are indicated with an arrow above their respective position in the mitochondrial genome (a,b) and as underlined sequences (c). Numbers on the upper line (b) specify the initial position and nucleotide polymorphism (100-bp sliding window, see text) on the consensus sequences of A. sapidus obtained in this study. Numbers below the lower line (b) show the relative position on M. musculus mtDNA. The graph above the CR map (b) reveals the distribution of nucleotide diversity along the sequence. An asterisk on a nucleotide position highlights a polymorphic site among 47 individual sequences. Arrows on (c) delimit conserved blocks in CR sequences. DI, first hypervariable domain; DII, second hypervariable domain.
The distribution of nucleotide diversity along the CR shows that most of the overall polymorphism lies in hypervariable domain I (peak π = 0.0406 at nucleotide midpoint position 150, representing 14 out of 19 total polymorphic sites; sliding window analysis, Figure 1), whereas the maximum peak of nucleotide diversity in the second hypervariable domain was about four times lower (π = 0.0103, midpoint position 725). Our results do not show any evidence for the existence of a third hypervariable domain within the CR in A. sapidus (as suggested in Clethrionomys; see Introduction and Matson and Baker 2001). The distribution of nucleotide diversity in A. sapidus is biased toward the first hypervariable domain, as in other species of rodents (e.g. Spalax galili; Reyes et al. 2003), and differs from the most commonly observed patterns, in which the polymorphism is divided between both hypervariable regions or even greatly biased toward the second hypervariable domain (Baker and Marshall 1997; Matson and Baker 2001; Roques et al. 2004). None of the haplotypes analyzed showed tandemly repeated sequences. Considering the observed distribution of polymorphisms and that the CSBs are flanked by homopolymer tracts (presumably involved in forming a stable hairpin structure in the CR), we propose to target a 246-bp CR fragment in the first hypervariable region that contains most of the total polymorphism in A. sapidus using forward F15468 (5′-GCATTAAATTATATTCCCCATGC-3′) and reverse R15713 (5′-TTGTTGGTTTCACGGAGGAT-3′) primers.
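
The sliding-window description of nucleotide diversity used above (windows of 100 sites moved in steps of 25 sites) can be illustrated with a minimal Python sketch. It is not the DnaSP implementation: it assumes an alignment of equal-length sequences without gaps or ambiguity codes, computes π as the mean number of pairwise differences per site, and reports an approximate window midpoint. The function names and the toy alignment are hypothetical and are not data from this study.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (pi) for aligned, equal-length sequences.

    Assumption: no gaps or ambiguity codes; every site is compared as-is.
    """
    n = len(seqs)
    length = len(seqs[0]) if seqs else 0
    if n < 2 or length == 0:
        return 0.0
    # Count mismatching sites for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


def sliding_window_pi(seqs, window=100, step=25):
    """Return a list of (midpoint, pi) tuples, one per window along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        midpoint = start + window // 2  # approximate midpoint from a 0-based start
        results.append((midpoint, nucleotide_diversity(chunk)))
    return results


# Toy alignment (hypothetical data, not sequences from this study).
toy = ["A" * 150, "A" * 10 + "CC" + "A" * 138, "A" * 150]
for midpoint, pi in sliding_window_pi(toy):
    print(midpoint, round(pi, 4))
```

With real data, the aligned haplotype sequences would replace the toy list; sites with gaps or missing data would need to be excluded first, which this sketch does not do.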
We also characterized the complete 1143-bp coding region of Cytb from six individuals from different geographic locations in Spain (GenBank accession numbers FJ539341–FJ539346). We found five different haplotypes defined by 10 polymorphic nucleotide positions (nine synonymous mutations and one non-synonymous mutation). In order to maximize the polymorphism amplified in a single short fragment, we suggest targeting a 248-bp fragment using the specific primers F14559 (5′-TCCTTTTGAGGGGCTACAGT-3′) and R14806 (5′-TGGAAGGGAATTTTGTCTGC-3′).
Furthermore, we applied primers F15468 and R15713 (CR), and F14559 and R14806 (Cytb), to non-invasive and museum samples (see Materials and methods) of A. sapidus (n = 26) (Table I) and A. t. scherman (n = 22) (northern Spain) in order to check the amplification success and usefulness of these two polymorphic fragments on degraded genetic material. Nucleotide diversity in Cytb was lower in A. sapidus (π = 0.00374) than in A. terrestris (π = 0.00912), whereas in CR it was higher in A. sapidus (π = 0.03623) than in A. terrestris (π = 0.0077). Comparison of CR polymorphism between species must be considered cautiously, because European water voles apparently harbor higher nucleotide diversity in the second hypervariable domain, a region that showed scarce variation in A. sapidus.
Following a thorough characterization of the structure and nucleotide variation in the A. sapidus mitochondrial CR, we propose the first hypervariable domain in A. sapidus as the most informative mitochondrial marker for studies at both intraspecific and interspecific scales. We report primers for the amplification of highly variable yet short fragments of both CR and Cytb (246 and 248 bp, respectively) that are especially useful for highly degraded genetic material, such as museum specimens and ancient DNA. The primers developed here should prove useful for studies addressing evolutionary and population genetic issues in A.
sapidus (and probably in closely related species), upon which future conservation and management guidelines should be based, and provide an efficient and cost-effective tool for species identification from materials such as feces or unidentifiable bones obtained from raptor pellets.
Acknowledgments
The authors are especially grateful to J. M. Llanes, M. Gutiérrez, M. González, F. Alda, J. Román, M. Delibes, the Molecular Ecology Laboratory and the scientific collections housed at the Biological Station of Doñana. They are also indebted to a long list of biologists who provided samples for this study. The present work was funded by the Dirección General de Investigación (project BOS2001-2391-C02-01). A. C.-C. was funded by the Spanish Ministry of Education and Science.
Declaration of interest: The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.
References
Baker AJ, Marshall HD. 1997. Mitochondrial control region sequences as tools for understanding evolution. In: Mindell DP, editor. Avian molecular evolution and systematics. Michigan: Academic Press. p 51–82.
Ballard JWO, Whitlock MC. 2004. The incomplete natural history of mitochondria. Mol Ecol 13:729–744.
Berthier K, Galan M, Foltête JC, Charbonnel N, Cosson JF. 2005. Genetic structure of the cyclic fossorial water vole (Arvicola terrestris): Landscape and demographic influences. Mol Ecol 14:2861–2871.
Centeno-Cuadros A, Delibes M, Godoy JA. 2009a. Dating the divergence between Southern and European water voles using molecular coalescent-based methods. J Zoolog 279:404–409.
Centeno-Cuadros A, Delibes M, Godoy JA. 2009b. Phylogeography of Southern Water Vole (Arvicola sapidus): Evidence for refugia within the Iberian glacial refugium? Mol Ecol 18:3652–3667.
Díaz de la Guardia R, Pretel A. 1978. Karyotype and centric dissociation in water vole Arvicola sapidus spp sapidus Miller 1908 (Rodentia, Muridae). Experientia 34:706–708.
Díaz de la Guardia R, Pretel A. 1979. Comparative study of the karyotypes of two species of water vole: Arvicola sapidus and Arvicola terrestris (Rodentia, Microtinae). Caryologia 32:183–188.
Martin Y, Gerlach G, Schlotterer C, Meyer A. 2000. Molecular phylogeny of European muroid rodents based on complete cytochrome b sequences. Mol Phylogenet Evol 16:37–47.
Matson CW, Baker RJ. 2001. DNA sequence variation in the mitochondrial control region of red-backed voles (Clethrionomys). Mol Biol Evol 18:1494–1501.
Megías-Nogales B, Marchal JA, Acosta MJ, Bullejos M, Díaz de la Guardia R, Sánchez A. 2003. Sex chromosome pairing in two Arvicolidae species: Microtus nivalis and Arvicola sapidus. Hereditas 138:114–121.
Müllenbach R, Lagoda PJL, Welter C. 1989. An efficient salt-chloroform extraction of DNA from blood and tissues. Trends Genet 5:391–391.
Oliver MK, Piertney SB. 2006. Isolation and characterization of a MHC class II DRB locus in the European water vole (Arvicola terrestris). Immunogenetics 58:390–395.
Piertney SB, Stewart WA, Lambin X, Telfer S, Aars J, Dallas JF. 2005. Phylogeographic structure and postglacial evolutionary history of water voles (Arvicola terrestris) in the United Kingdom. Mol Ecol 14:1435–1444.
Reyes A, Nevo E, Saccone C. 2003. DNA sequence variation in the mitochondrial control region of subterranean mole rats, Spalax ehrenbergi superspecies, in Israel. Mol Biol Evol 20:622–632.
Rigaux P, Vaslin M, Noblet JF, Amori G, Muñoz LJP. 2008. Arvicola sapidus. In: IUCN 2009. IUCN Red List of Threatened Species. Version 2009.2. www.iucnredlist.org. More info at http://www.iucnredlist.org/apps/redlist/details/2150/0
Roques S, Godoy JA, Negro JJ, Hiraldo F. 2004. Organization and variation of the mitochondrial control region in two vulture species, Gypaetus barbatus and Neophron percnopterus. J Hered 95:332–337.
Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R. 2003. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19:2496–2497.
Rozen S, Skaletsky HJ. 2000. Primer3 on the WWW for general users and for biologist programmers. In: Krawetz S, Misener S, editors. Bioinformatics methods and protocols: Methods in molecular biology. Totowa, NJ: Humana Press. p 365–386.
Saccone C, Attimonelli M, Sbisa E. 1987. Structural elements highly preserved during the evolution of the D-loop-containing region in vertebrate mitochondrial DNA. J Mol Evol 26:205–211.
Saccone C, Lanave C, Pesole G, Sbisa E. 1993. Peculiar features and evolution of mitochondrial genome in mammals. In: DiMauro S, Wallace DC, editors. Mitochondrial DNA in human pathology. New York: Raven Press. p 27–37.
Sbisa E, Tanzariello F, Reyes A, Pesole G, Saccone C. 1997. Mammalian mitochondrial D-loop region structural analysis: Identification of new conserved sequences and their functional and evolutionary implications. Gene 205:125–140.
Stacy JE, Jorde PE, Steen H, Ims RA, Purvis A, Jakobsen KS. 1997. Lack of concordance between mtDNA gene flow and population density fluctuations in the bank vole. Mol Ecol 6:751–759.
Stewart WA, Piertney SB, Dallas JF. 1998. Isolation and characterization of highly polymorphic microsatellites in the water vole, Arvicola terrestris. Mol Ecol 7:1258–1259.
Stewart WA, Dallas JF, Piertney SB, Marshall F, Lambin X, Telfer S. 1999. Metapopulation genetic structure in the water vole, Arvicola terrestris, in NE Scotland. Biol J Linnean 68:159–171.
Zhang D, Hewitt G. 1997. Insect mitochondrial control region: A review of its structure, evolution and usefulness in evolutionary studies. Biochem Syst Ecol 25:99–120.