J. Zool. Syst. Evol. Research 42 (2004) 215–222
© 2004 Blackwell Verlag, Berlin
ISSN 0947–5745
Received on 25 May 2004

1 Section for Zoology, Natural History Museums and Botanical Garden, University of Oslo, Oslo, Norway; 2 Division of General Human Genetics, Institute of Anthropology and Human Genetics, University of Tübingen, Tübingen, Germany

Allelic variation, fragment length analyses and population genetic models: a case study on Drosophila microsatellites*

L. Bachmann1, P. Bareiß1,2 and J. Tomiuk2

Abstract

The allelic variation of 16 microsatellite loci from selected species of the Drosophila melanogaster and D. obscura groups was determined. Intra- and interspecific sequence comparisons allowed discrimination of mutations affecting the repetitive microsatellite from those affecting the flanking regions. The results of our analyses support two hypotheses: that slippage needs a minimum number of repeats in order to become efficient with respect to microsatellite variability, and that the mutation rate increases with the length of the microsatellite. In particular, the interrupted complex microsatellite locus BICOID in the species of the D. obscura group shows extensive variation in the flanking regions in addition to length and sequence variation of the repetitive microsatellite. The allelic variation at this locus can hardly be explained by slippage alone. Estimates of microsatellite variability by fragment length analyses will pick up only a minor fraction of allelic variation at such loci, and conclusions that are based on the stepwise mutation model will not hold.

Key words: Drosophila melanogaster group – Drosophila obscura group – microsatellites – repetitive DNA – sequence analysis – slippage replication

Introduction

Microsatellites or simple sequence repeats are tandemly repeated 1–6 bp long sequence motifs (Tautz and Schlötterer 1994; Goldstein and Pollock 1997; Li et al. 2002).
Homopolymers, however, are excluded in some definitions (Chambers and MacAvoy 2000). According to these authors, microsatellites can be grouped into six categories. (1) Microsatellites can consist of identical repeats (perfect microsatellites). (2) Few point mutations can interrupt the perfect microsatellites (interrupted microsatellites). (3) Microsatellites can consist of two adjacent perfect microsatellites with different motifs (compound microsatellites). (4) The two repetitive sequences of compound microsatellites can be interrupted by a short non-repetitive sequence (interrupted compound microsatellites). (5) Complex microsatellites can contain several different perfect repetitive sequences (complex microsatellites). (6) Complex microsatellites can be interrupted by non-repetitive sequences (interrupted complex microsatellites).

Microsatellites are more or less randomly dispersed in the genomes of all eukaryotes although there are differences in structure, length and number of microsatellites between species (Bachtrog et al. 1999; Chambers and MacAvoy 2000; Harr and Schlötterer 2000). Microsatellites occur more frequently and are longer in vertebrates than in invertebrates. In Drosophila, microsatellites are relatively short and most of them are dinucleotide repeats (66%), followed by trinucleotide repeats (30%) and a small fraction of tetranucleotide repeats (4%) (Schug et al. 1998).

The mode of evolution of microsatellites has been discussed in particular with respect to their mutational instability. Compared with other loci, the mutation rate of microsatellite loci seems to be high with 10⁻²–10⁻⁶ per locus and generation, for example, 10⁻⁴–10⁻⁶ for Drosophila melanogaster (Schlötterer et al. 1998) and 10⁻³–10⁻⁵ in mammals (Schug et al. 1997). Studies of Drosophila species (Goldstein and Clark 1995) and yeast (Wierdl et al.
1997) indicate that the mutation rate of microsatellites is positively correlated with the number of repeats, but the impact of the length of microsatellite motifs is still debated. A study of human families revealed, for example, a four times higher mutation rate for tetranucleotide than for dinucleotide repeats (Weber and Wong 1993), whereas Chakraborty et al. (1997) observed, in a population study, a two times higher rate for dinucleotide microsatellites.

There are two major mutational processes that can change the number of repeats of microsatellites: (1) slippage replication, a mispairing of the template and newly replicated strands during DNA replication (Levinson and Gutman 1987), and (2) recombination processes such as unequal crossing-over and gene conversion (Garza et al. 1995). Slippage is considered the dominant process for the generation of microsatellite variability (Schlötterer 2000) and it is assumed that slippage needs a minimum number of repeats in order to become efficient with respect to microsatellite variability. This implies that mutation processes other than slippage must first extend short repetitive sequences (protomicrosatellites) before slippage can generate microsatellite variability (Rose and Falush 1998; Lai and Sun 2003). According to theoretical analyses (e.g. Stephan and Kim 1998), slippage becomes efficient when protomicrosatellites consist of more than five repeats. However, our understanding of the evolution of protomicrosatellites is far from complete.

Various models have been put forward in order to explain allelic variability of microsatellite loci. Parameters such as increasing mutation rates with increasing repeat numbers, limited length of microsatellites, higher probability for microsatellite expansion than for reduction, or uncoupling of the slippage by interruptions of perfect repetitive sequences through point mutations (e.g. Zhivotovsky et al. 1997; Kruglyak et al. 1998; Sibly et al.
2001) were used to modify simple stepwise mutation models (Ohta and Kimura 1973). However, to date, there is no model available that can perfectly predict the allelic variation observed in experimental or natural populations (e.g. Colson and Goldstein 1999). Nevertheless, statistical tools that are based on stepwise mutation models were developed to analyse population structures by means of microsatellite variation (Goldstein et al. 1995a,b; Slatkin 1995; Pritchard and Feldman 1996). However, the processes creating microsatellite variation are more complex, and simple stepwise mutation models should not be applied uncritically (Colson and Goldstein 1999). Coulson et al. (1998) proposed an individual heterogeneity measure d² that is also based on the number of repeats and can easily be expanded to a grand mean over groups and populations.

*Dedicated to Prof. Ernst Mayr on the occasion of his 100th birthday.
U.S. Copyright Clearance Center Code Statement: 0947–5745/04/4203–0215$15.00/0 www.blackwell-synergy.com

Microsatellites are frequently used markers for analysing various problems in biology. The high variability along with almost fully automated analysis techniques allows a rapid screening of large sample sizes. Microsatellites are efficient tools for the identification of individuals and for the assessment of individuals' relationships (Chambers and MacAvoy 2000). In almost all studies conducted at present, microsatellite variation is determined by automated fragment length analysis. The length of microsatellites (usually PCR products) is, therefore, the only criterion for the identification of alleles, and subsequent statistical analyses may be affected by this experimental approach. Fragment length analysis cannot discriminate between length variation of the microsatellite and length variation of the flanking regions that are amplified together with the microsatellite. Allelic diversity of microsatellite loci may, therefore, be underestimated.
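The limitation of fragment length analysis described above can be illustrated with a minimal sketch. It assumes a simplified amplicon made of a 5' flank, a perfect repeat array and a 3' flank. The `Allele` class and the three toy sequences are hypothetical and invented for illustration; they are not data from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allele:
    """Hypothetical PCR amplicon: 5' flank + perfect repeat array + 3' flank."""
    flank5: str
    motif: str
    repeats: int
    flank3: str

    def sequence(self) -> str:
        # Full amplicon sequence as it would be read by sequencing
        return self.flank5 + self.motif * self.repeats + self.flank3

    def fragment_length(self) -> int:
        # The only quantity a fragment length analysis reports
        return len(self.sequence())


# Invented toy alleles (assumptions, not observed sequences)
allele_a = Allele("ACGTAC", "TG", 8, "GATTCA")  # reference
allele_b = Allele("ACGT", "TG", 9, "GATTCA")    # 2-bp flank deletion plus one extra repeat
allele_c = Allele("ACGTAC", "TG", 8, "GACTCA")  # base substitution in the 3' flank

alleles = (allele_a, allele_b, allele_c)
distinct_sequences = {a.sequence() for a in alleles}
distinct_lengths = {a.fragment_length() for a in alleles}
print(len(distinct_sequences), "sequence alleles,", len(distinct_lengths), "length variant(s)")
```

In this toy case the three alleles differ in sequence but share one fragment length, so a length-based assay would score them as a single allele.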
In this study, we analyse sequence data of microsatellites in coding and non-coding regions of selected Drosophila species. We determined the sequence variation of microsatellites and their flanking sequences and the impact of the most likely mutation processes in order to explain the observed variation of the respective microsatellite loci.

Material and methods

Database search for microsatellite sequences

GenBank and personal websites, such as, http://www.mbg.cornell.edu/ aquadro/aquadrolab.html of C. Aquadro (Cornell University Ithaca, NY, USA), were searched for microsatellite sequences that allowed an intra- and interspecific comparison. The search was guided by three criteria: (1) Sequence data must be available for closely related species. (2) Several microsatellite loci must be described for these species. (3) Sequence data of several microsatellite alleles must be available at least in one species.

Experimental studies of microsatellite sequences

Microsatellites previously described for D. pseudoobscura were amplified and sequenced from individuals of laboratory cultures of eight species of the D. obscura group. They are: D. miranda, D. obscura, D. ambigua, D. tristis, D. bifasciata, D. subobscura, D. madeirensis and D. guanche. The microsatellites BICOID and DPSX006 were characterized for individuals from natural populations of D. subobscura, D. obscura, and D. helvetica that were collected in Tübingen, Germany, during summer 2003 by D. Sperlich.

DNA isolation, PCR amplification of microsatellites and sequencing

The DNA from Drosophila individuals was isolated according to the manual of the 'DNA Isolation Kit' (Gentra Systems, Minneapolis, MN, USA). Cloned Pfu DNA polymerase (Stratagene, La Jolla, CA, USA) was used for the PCR-based amplification of microsatellites according to the supplier’s instructions.
PCR products were subsequently purified by means of the QIAquick PCR purification kit (Qiagen, Hilden, Germany) and sequenced on an ABI 3100 automatic sequencer (Applied Biosystems, Foster City, CA, USA) according to the chain termination method (Sanger et al. 1977) using Big Dye chemistry (Applied Biosystems). The microsatellite loci DPS2003, DPS2004, DPS2005, DPS2007, DPS4001, DPSX001, DPSX003, DPSX006, DPSX009, RUNT, TROP1, E74A, and BICOID were tested using primers described by Noor et al. (2000).

Results

To our surprise, sequences of only 15 microsatellite loci that met the selection criteria could be retrieved from databases. Eleven loci occur in species of the D. melanogaster group and four in species of the D. obscura group. Eleven of them had a dinucleotide, three loci a trinucleotide, and one locus a tetranucleotide motif (see Tables 1 and 2). The majority of microsatellite sequences from D. melanogaster group species were published by Colson and Goldstein (1999).
The following sequences were retrieved (locus: accession number):
RHOb: AF067878-AF067882
ABDB: AF067865-AF067868, L07835, X51663
U1951: AF067911-AF067914, X53543
DSRC: AF067908-AF067910, AC010665
NANOS: AF067873-AF067876, AY075406, AE003725, M72421
SIDNA: AF067892-AF067895, X79340
CLONE: AF067883-AF067886, AE003710
DME2910: AJ271561, AJ291063, DMA246211, DME291016, DME29104, DME291052, DME291055, DME291056, DME291058, DME291061, DME291064, DME291066, DME291071, DME291072, DME291088, DME291092, DME291096, DME291100, DME291103, DME291105, DME291107, DME291110, DME291111, DME291113, DSE246210, DSI246209
HOX: AF067869-AF067872
SIMA: AF067928-AF0679233, U43090
EHAB: AF067924, AF067934, AF067935, X72303
DPS4002: AF320181, AF320182, AF320185, AF320187-AF320189, AF450831, AF450833, AF450835-AF450840, AF450865, AF450869
DECPENT: AY012610-AY012617
DPSX006: AF157573
DPSX010: AF320138, AF320152, AF320154, AF320166, AF450643, AF450644, AF450648, AF450649, AF450656, AF450657, AF450659, AF450662, AF450665, AF450666, AF450672
BICOID: AF450963, AF450964, AF450949, AF450954, AF450955, AF450962

In order to extend the data set, we attempted to amplify the microsatellite loci DPS2003, DPS2004, DPS2005, DPS2007, DPS4001, DPSX001, DPSX003, DPSX006, DPSX009, RUNT, TROP1, E74A, and BICOID in various species of the D. obscura group using the primers described by Noor et al. (2000). These primers were developed to amplify the respective microsatellites in D. pseudoobscura. They all amplified homologous microsatellites in D. miranda, but only the two loci DPSX006 and BICOID could be amplified from all D. obscura group species tested.

The compiled data set was subsequently analysed for sequence variability, i.e. the number of tandemly repeated motifs, length variation of the microsatellite, length variation of the flanking region, and sequence variation of the flanking region.
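Counting the tandemly repeated motifs in a sequence can be sketched as follows. This is a minimal illustration and not the procedure used in this study. It assumes that the repeat array is summarised by the lengths of successive perfect runs of one motif, written with plus signs in a notation such as "5+3". The function names and the toy sequence are hypothetical.

```python
import re


def repeat_runs(seq: str, motif: str) -> list[int]:
    """Return the number of motif copies in each perfect tandem run, left to right.

    Assumption: only exact, in-phase copies of `motif` are counted; isolated
    single copies are reported as runs of length 1.
    """
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    return [len(m.group(0)) // len(motif) for m in pattern.finditer(seq.upper())]


def repeat_notation(seq: str, motif: str) -> str:
    """Join the run lengths with '+', e.g. two runs of 5 and 3 copies give '5+3'."""
    return "+".join(str(n) for n in repeat_runs(seq, motif))


# Invented toy sequence: five TG copies, an AA interruption, three TG copies
toy_sequence = "CCTGTGTGTGTGAATGTGTGCC"
print(repeat_notation(toy_sequence, "TG"))
```

A real analysis would also have to decide how to treat imperfect copies and motifs in a shifted reading phase, which this sketch deliberately ignores.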
Furthermore, we estimated the ratio of the number of alleles and length variants detectable by automated fragment length analyses (Tables 1 and 2).

Table 1. Sequence variation of eleven microsatellite loci from species of the Drosophila melanogaster group. Sequences were retrieved from databases. Only sequences of different alleles were compiled and analysed.
[Table body not recoverable from the extracted layout. Columns: loci RHOb, ABDB, U1951, DSRC, NANOS, SIDNA, CLONE, DME2910, HOX, SIMA and EHAB. Rows, each given for D. melanogaster, D. simulans and D. sechellia: motif; number of different alleles available from databases; microsatellite type(1); number of repeats; length variation of microsatellite; length variation of flanking regions; sequence variation in flanking regions; number of alleles detectable with fragment analysis.]
(1) According to the definitions given in the introduction. n.a., not applicable.

Table 2. Sequence variation of five microsatellite loci from species of the Drosophila obscura group. Sequences were either retrieved from databases or determined in this study. Only sequences of different alleles were compiled and analysed.
[Table body not recoverable from the extracted layout. Columns: loci (motif) DPS4002 (CA), DECPENT (CA), DPSX010 (TG), DPSX006 (TG) and BICOID (CAG). Rows, each given for D. pseudoobscura, D. persimilis, D. miranda, D. affinis, D. obscura, D. ambigua, D. tristis, D. bifasciata, D. subobscura, D. madeirensis, D. guanche and D. helvetica: number of different alleles available from databases; microsatellite type(1); number of repeats; length variation of microsatellite; length variation of flanking regions; sequence variation in flanking regions; number of alleles detectable with fragment analysis.]
(1) According to the definitions given in the introduction. n.a., not applicable.

The variability of microsatellites cannot, as already observed by Colson and Goldstein (1999), be assigned exclusively to the variation of the number of tandemly repeated motifs. In addition, most loci show variation in sequence and/or length of the flanking regions. In 16 of 20 possible intraspecific sequence comparisons in the D. melanogaster group species, the allelic variation of microsatellites is caused by varying numbers of the repeated motifs whereas length and sequence variation in the flanking regions occur six and five times, respectively (Table 1). There is a similar pattern for the microsatellite loci in the D. obscura group species. Of the 15 possible intraspecific comparisons, variable numbers of the microsatellite motif were observed 10 times, and length and sequence variation of the flanking region six and 11 times, respectively (Table 2). Thus, in many instances, alleles are defined by mutations affecting the flanking region rather than the tandemly repeated microsatellite motif.

In the compiled data set, estimates of the number of alleles by sequence analysis and fragment length analysis are fairly consistent for most loci, i.e. most alleles can be detected by both methods. However, this is most likely the result of the low number of sequences (alleles) available for the majority of species and loci.
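The comparison between alleles defined by sequence and alleles defined by fragment length can be sketched as a simple count. The example below is a minimal illustration under the assumption that each allele is available as a plain sequence string covering the same amplified region. The function name and the toy sequences are hypothetical and are not taken from the compiled data set.

```python
def alleles_vs_length_variants(sequences):
    """Return (number of distinct sequences, number of distinct fragment lengths)."""
    alleles = {s.upper() for s in sequences}      # alleles as defined by sequencing
    length_variants = {len(s) for s in alleles}   # alleles as defined by fragment length
    return len(alleles), len(length_variants)


# Invented toy alleles of one hypothetical CAG locus
toy_alleles = [
    "ACCAGCAGCAGTT",     # reference, 13 bp
    "ACCAGCAACAGTT",     # substitution inside the repeat, 13 bp
    "ACCAGCAGCAGTA",     # substitution in the flank, 13 bp
    "ACCAGCAGCAGCAGTT",  # one additional CAG repeat, 16 bp
]

n_alleles, n_length_variants = alleles_vs_length_variants(toy_alleles)
print(n_alleles, "alleles by sequence,", n_length_variants, "by fragment length")
```

With only one or two sequences per species, both counts are necessarily close, which is the sampling effect noted in the text.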
In those instances with a higher number of different sequences available (more than five alleles), i.e. D. melanogaster (DME2910), D. pseudoobscura (DPS4002, DPSX010, BICOID), D. obscura (BICOID), D. subobscura (BICOID) and D. helvetica (DPSX006), the numbers of alleles detectable by sequence and fragment length analyses, respectively, can differ substantially (Fig. 1). There are, for example, 15 known alleles for the locus BICOID in D. subobscura but only four length variants can be distinguished. In D. pseudoobscura, only three of 15 alleles can be distinguished by fragment length analysis. Thus, many alleles remain undetectable by fragment length analysis.

Discussion

The enormous variability that may be observed at microsatellite loci (Tautz and Renz 1984; Litt and Luty 1989; Tautz 1993) makes them suitable genetic markers for the characterization of population structures, linkage studies in human genetics, animal and plant breeding as well as the identification of individuals and family analyses. In many studies of allelic variation of microsatellites, alleles are simply characterized by their length, and fragment length is assumed to correlate closely with the number of repeats. The stepwise mutation model (Ohta and Kimura 1973) and its modifications (e.g. Orti et al. 1997; Zhivotovsky et al. 1997; Kruglyak et al. 1998; Sibly et al. 2001) are considered to explain the evolutionary changes of microsatellite structures better than classical finite mutation models (see, e.g. Hartl and Clark 1997). Basic stepwise mutation models assume that an allele has the same probability to mutate to a longer or shorter state, whereas modified stepwise mutation models limit the number of repeats (e.g. Falush and Iwasa 1999) and assume length-dependent mutation rates. It has also been proposed that interruptions of the tandemly repeated motif through point mutations increase the mutational stability of a microsatellite region (Kruglyak et al. 1998), i.e.
slippage processes become more unlikely.

Fig. 1. Number of alleles as determined by sequencing (alleles) versus number of alleles as defined by fragment length analysis (length variants) at the microsatellite loci DME2910, DPSX006, BICOID and DPS4002. Only Drosophila species with two or more sequenced alleles are included.

The genesis of the repetitive structure is of particular interest for an understanding of the evolutionary dynamics of microsatellite loci. It has been suggested that a minimum number of repeats is necessary before slippage can operate efficiently (>4 repeats: Lai and Sun 2003; >5 repeats: Sibly et al. 2001; >8 repeats: Rose and Falush 1998). The inter- and intraspecific comparison of the microsatellite locus DPSX006 in species of the D. obscura group supports this threshold hypothesis. DPSX006, an interrupted compound microsatellite consisting of at least two stretches of tandemly repeated TG motifs, shows high intraspecific variability (8–15 repeats) of one repetitive stretch in D. helvetica, whereas there is no variation of this stretch with respect to repeat number in the other European D. obscura group species studied. These data support the two-phase mutation model of di Rienzo et al. (1994). The model assumes that a protomicrosatellite has to evolve first through point mutations that randomly increase the number of tandemly repeated sequence motifs before a multiple-step process can change the length of the repeated array by several units. Such enlarged repetitive fragments have an increased probability of being disturbed by point mutations.
As a consequence, slippage rates might be reduced (Schug et al. 1998). However, slippage can also remove point mutations and, thus, create again a perfect repetitive pattern (Harr et al. 2000). Such a process may explain the variation observed at the microsatellite BICOID in D. subobscura. In this species, the last two repeated CAG stretches are homogenized in three of 15 alleles. In recent years, there has been substantial effort in order to develop statistical methods for the analysis of population divergence (Goldstein et al. 1995a,b; Slatkin 1995; Nauta and Weissing 1996; Pritchard and Feldman 1996; Feldman et al. 1997) and genetic heterogeneity of individuals and populations (Coulson et al. 1998) based on microsatellite data. It is believed that the evolution of microsatellite variation is complex and cannot be explained by simple stepwise mutation models (Colson and Goldstein 1999, for review see Li et al. 2002). Various mutational events affect microsatellite structures and the mutation rate of microsatellites is expected to correlate positively with the number of repeats. In addition, a mutation bias towards increased numbers of repeats has been observed (e.g. Harr and Schlötterer 2000; Vigouroux et al. 2002). This likelihood for an ascertainment bias has been challenged. Primmer and Ellegren (1998), for example, did not find any indication for a mutational length bias. Thus, it is difficult or even impossible to develop a realistic model for the evolutionary dynamics of microsatellite structures. It is therefore not surprising that the results of several experiments contradict conclusions drawn on the basis of basic stepwise mutation models (Angers and Bernatchez 1997; Orti et al. 1997; Colson and Goldstein 1999). More recent studies (e.g. Tsitrone et al. 
2001) showed theoretically that the classical degree of heterozygosity H might be a more reliable measure for the association between heterozygosity and fitness of individuals or populations than the diversity measure d² (Coulson et al. 1998) that is based on discrete changes of repeat numbers. The substantial variation in the flanking regions of the microsatellite loci DME2910 in D. melanogaster and BICOID in D. pseudoobscura and D. obscura supports this point of view.

Apart from our currently limited understanding of the evolutionary dynamics of microsatellites, technical obstacles may affect conclusions drawn from the analyses of microsatellite variation. Primers used for the amplification of microsatellite loci do not exclusively amplify the tandemly repeated motifs but, to varying extents, the flanking regions as well. The subsequent fragment analysis, which is the most frequently applied technique for the assessment of microsatellite variation, cannot discriminate between mutations affecting the repeated region and mutations affecting the flanking regions, and base substitutions cannot be detected at all. Thus, allelic diversity of loci may be substantially underestimated. However, the extent of such underestimation is difficult to assess, because sequence data are limited. As can be seen in our compiled list (Tables 1 and 2), there are rarely more than two microsatellite sequences per species available. We estimated the ratio of all sequenced alleles to alleles detectable by fragment analysis for the most comprehensive data sets of the four loci DME2910, BICOID, DPSX006 and DPS4002 in those species with at least two alleles sequenced (Fig. 1). It turned out that the majority of DME2910, DPSX006 and DPS4002 alleles can be detected by fragment analysis. However, in the instance of BICOID, a substantial fraction of microsatellite variation (73.3–80%) would remain undetected.
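The range of 73.3–80% is consistent with the BICOID allele counts given in the Results, if the undetected fraction is read as one minus the ratio of length variants to sequenced alleles. This reading, and the symbols A and L, are assumptions introduced here for illustration:

```latex
% A = number of sequenced alleles, L = number of length variants (illustrative symbols)
\[
  f_{\mathrm{undetected}} = 1 - \frac{L}{A}
\]
% BICOID in D. subobscura: A = 15, L = 4
\[
  1 - \frac{4}{15} = \frac{11}{15} \approx 0.733
\]
% BICOID in D. pseudoobscura: A = 15, L = 3
\[
  1 - \frac{3}{15} = \frac{12}{15} = 0.80
\]
```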
Although the data set does not allow for an extensive analysis, the discrepancy between the numbers of alleles and length fragments may be related to the complex structure of the microsatellite and the relatively low number of repeated motifs. The loci DME2910, DPSX006 and DPS4002, with better congruence of both estimates, have a simpler structure (i.e. perfect or interrupted microsatellites) and higher numbers of repeated motifs. In these instances, mutation through slippage is most important. In instances of interrupted complex microsatellites such as BICOID, slippage contributes much less to the generation of allelic variation and estimates obtained by fragment length analyses will not fit any of the current models on the mode of evolution of microsatellites.

Acknowledgements

We thank D. Sperlich for collecting D. subobscura, D. obscura, and D. helvetica. We thank H. Esmer for his valuable technical assistance in the laboratory and W. D. Braun for assistance in the database searches. This work was supported by a grant from the fortüneprogramm of the Universitätsklinikum Tübingen (project no. 1189-00); PB and JT were supported by a travel grant of the German Academic Exchange Service DAAD 13/PPP-N1, and LB was supported by a grant from the Research Council of Norway (National Centre for Biosystematics, 146515/420).

Zusammenfassung (Summary)

Allelic variation, fragment length analyses and population genetic models: a case study on Drosophila microsatellites

The allelic variation of 16 microsatellite loci from species of the Drosophila melanogaster and D. obscura groups was analysed at the sequence level. Intra- and interspecific sequence comparisons made it possible to distinguish mutations in the flanking regions of the microsatellites from mutations in the repetitive regions. The results allow conclusions about the mutational mechanisms that determine the variability of microsatellites. The hypotheses that repetitive microsatellite regions need a certain minimum number of repeats in order to show pronounced length variation through slippage, and that the mutation rate is higher in longer microsatellites than in short ones, are supported by the interspecific sequence comparisons. In addition to the length and sequence variation of the repetitive microsatellite regions, pronounced variability in the flanking regions is found, above all at the interrupted complex microsatellite locus BICOID in the species of the D. obscura group. The allelic variation at this locus cannot be explained by slippage alone. The determination of microsatellite variation by fragment length analysis can capture only a small fraction of the existing variation at such loci, and conclusions based on a stepwise mutation model are of only limited validity.

References

Angers, B.; Bernatchez, L., 1997: Complex evolution of salmonid microsatellite locus and its consequences in inferring allelic divergence from size variation. Mol. Biol. Evol. 14, 230–238.
Bachtrog, D.; Weiss, S.; Zangerl, B.; Brem, G.; Schlötterer, C., 1999: Distribution of dinucleotide microsatellites in the Drosophila melanogaster genome. Mol. Biol. Evol. 16, 602–610.
Chakraborty, R.; Kimmel, M.; Stivers, D.; Davison, L.; Deka, R., 1997: Relative mutation rates at di-, tri-, and tetranucleotide microsatellite loci. Proc. Natl Acad. Sci. USA 94, 1041–1046.
Chambers, G. K.; MacAvoy, E. S., 2000: Microsatellites: consensus and controversy. Comp. Biochem. Physiol. B 126, 455–476.
Colson, I.; Goldstein, D. B., 1999: Evidence for complex mutations at microsatellite loci in Drosophila. Genetics 152, 617–627.
Coulson, T. N.; Pemberton, J. M.; Albon, S. D.; Beaumont, M.; Marshall, T. C.; Slate, J.; Guinness, F. E.; Clutton-Brock, T. H., 1998: Microsatellites reveal heterosis in red deer. Proc. R. Soc. Lond. B 265, 489–495.
Falush, D.; Iwasa, Y., 1999: Size-dependent mutability and microsatellite constraints. Mol. Biol. Evol. 16, 960–966.
Feldman, M. W.; Bergman, A.; Pollock, D. D.; Goldstein, D. B., 1997: Microsatellite genetic distances with range constraints: analytic description and problems of estimation. Genetics 145, 207–216.
Garza, J. C.; Freimer, M.; Freimer, N. B., 1995: Microsatellite allele frequencies in humans and chimpanzees, with implications for constraints on allele size. Mol. Biol. Evol. 12, 594–603.
Goldstein, D. B.; Clark, A. G., 1995: Microsatellite variation in North American populations of Drosophila melanogaster. Nucleic Acids Res. 23, 3882–3886.
Goldstein, D. B.; Pollock, D. D., 1997: Launching microsatellites: a review of mutation processes and methods of phylogenetic inference. J. Hered. 88, 335–342.
Goldstein, D. B.; Ruiz Linares, A.; Cavalli-Sforza, L. L.; Feldman, M. W., 1995a: Genetic absolute dating based on microsatellites and the origin of modern humans. Proc. Natl Acad. Sci. USA 92, 6723–6727.
Goldstein, D. B.; Ruiz Linares, A.; Cavalli-Sforza, L. L.; Feldman, M. W., 1995b: An evaluation of genetic distances for use with microsatellite loci. Genetics 139, 463–471.
Harr, B.; Schlötterer, C., 2000: Long microsatellite alleles in Drosophila melanogaster have a downward mutation bias and short persistence times, which cause their genome-wide underrepresentation. Genetics 155, 1213–1220.
Harr, B.; Zangerl, B.; Schlötterer, C., 2000: Removal of microsatellite interruptions by DNA replication slippages: Phylogenetic evidence from Drosophila. Mol. Biol. Evol. 17, 1001–1009.
Hartl, D. L.; Clark, A. G., 1997: Principles of Population Genetics. Sunderland, MA: Sinauer Associates.
Kruglyak, S.; Durrett, R. T.; Schug, M. D.; Aquadro, C. F., 1998: Equilibrium distributions of microsatellite repeat length resulting from a balance between slippage events and point mutations. Proc. Natl Acad. Sci. USA 95, 10774–10778.
Lai, Y.; Sun, F., 2003: The relationship between microsatellite slippage mutation rate and the number of repeat units. Mol. Biol. Evol. 20, 2123–2131.
Levinson, G.; Gutman, G. A., 1987: Slipped-strand mispairing: a major mechanism for DNA sequence evolution. Mol. Biol. Evol. 4, 203–221.
Li, Y.-C.; Korol, A. B.; Fahima, T.; Beiles, A.; Nevo, E., 2002: Microsatellites: genomic distribution, putative functions and mutational mechanisms: a review. Mol. Ecol. 11, 2453–2465.
Litt, M.; Luty, J. A., 1989: A hypervariable microsatellite revealed by in vitro amplification of a dinucleotide repeat within the cardiac muscle actin gene. Am. J. Hum. Genet. 44, 397–401.
Nauta, M. J.; Weissing, F. J., 1996: Constraints on allele size at microsatellite loci: Implications of genetic differentiation. Genetics 143, 1021–1032.
Noor, M. A. F.; Schug, M. D.; Aquadro, C. F., 2000: Microsatellite variation in populations of Drosophila pseudoobscura and Drosophila persimilis. Genet. Res. 75, 25–35.
Ohta, T.; Kimura, M., 1973: A model of mutation appropriate to estimate the number of electrophoretically detectable alleles in a finite population. Genet. Res. 22, 201–204.
Orti, G.; Pearse, D. E.; Avise, J. C., 1997: Phylogenetic assessment of length variation at a microsatellite locus. Proc. Natl Acad. Sci. USA 94, 10745–10749.
Primmer, C. R.; Ellegren, H., 1998: Patterns of molecular evolution in avian microsatellites. Mol. Biol. Evol. 15, 997–1008.
Pritchard, J. K.; Feldman, M. W., 1996: Statistics for microsatellite variation based on coalescence. Theor. Pop. Biol. 50, 325–344.
di Rienzo, A.; Peterson, A. C.; Garza, J. C.; Valdes, A. M.; Slatkin, M.; Freimer, N. B., 1994: Mutational processes of simple-sequence repeat loci in human populations. Proc. Natl Acad. Sci. USA 91, 3166–3170.
Rose, O.; Falush, D., 1998: A threshold size for microsatellite expansion. Mol. Biol. Evol. 15, 613–615.
Sanger, F.; Nicklen, S.; Coulson, A. R., 1977: DNA sequencing with chain-terminating inhibitors. Proc. Natl Acad. Sci. USA 74, 5463–5467.
Schlötterer, C., 2000: Evolutionary dynamics of microsatellite DNA. Chromosoma 109, 365–371.
Schlötterer, C.; Ritter, R.; Harr, B.; Brem, G., 1998: High mutation rate of a long microsatellite allele in Drosophila melanogaster provides evidence for allele-specific mutation rates. Mol. Biol. Evol. 15, 1269–1274.
Schug, M. D.; Mackay, T. F. C.; Aquadro, C. F., 1997: Low mutation rates of microsatellite loci in Drosophila melanogaster. Nat. Genet. 15, 99–102.
Schug, M. D.; Wetterstrand, K. A.; Gaudette, M. S.; Lim, R. H.; Hutter, C. M.; Aquadro, C. F., 1998: The distribution and frequency of microsatellite loci in Drosophila melanogaster. Mol. Ecol. 7, 57–70.
Sibly, R. M.; Whittaker, J. C.; Talbot, M., 2001: A maximum-likelihood approach to fitting equilibrium models of microsatellite evolution. Mol. Biol. Evol. 18, 413–417.
Slatkin, M., 1995: Hitchhiking and associative overdominance at a microsatellite locus. Mol. Biol. Evol. 12, 473–480.
Stephan, W.; Kim, Y., 1998: Persistence of microsatellite arrays in infinite populations. Mol. Biol. Evol. 15, 1332–1336.
Tautz, D., 1993: Notes on the definition and nomenclature of tandemly repetitive DNA sequences. In: Pena, S. D. J.; Chakraborty, R.; Epplen, J. T.; Jeffreys, A. J. (eds), DNA Fingerprinting: State of the Science. Basel: Birkhäuser Verlag, pp. 21–28.
Tautz, D.; Renz, M., 1984: Simple sequences are ubiquitous repetitive components of eukaryotic genomes. Nucleic Acids Res. 12, 4127–4138.
Tautz, D.; Schlötterer, C., 1994: Simple sequences. Curr. Opin. Genet. Dev. 4, 832–837.
Tsitrone, A.; Rousset, F.; David, P., 2001: Heterosis, marker mutational processes and population inbreeding history. Genetics 159, 1845–1859.
Vigouroux, Y.; Jaqueth, J. S.; Matsuoka, Y.; Smith, O. S.; Beavis, W. D.; Smith, J. S.; Doebley, J., 2002: Rate and pattern of mutation at microsatellite loci in maize. Mol. Biol. Evol. 19, 1251–1260.
Weber, J. L.; Wong, C., 1993: Mutation of human short tandem repeats. Hum. Mol. Genet. 2, 1123–1128.
Wierdl, M.; Dominska, M.; Petes, T. D., 1997: Microsatellite instability in yeast: dependence on the length of the microsatellite. Genetics 146, 769–779.
Zhivotovsky, L. A.; Feldman, M. W.; Grishechkin, S. A., 1997: Biased mutations and microsatellite variation. Mol. Biol. Evol. 14, 926–933.

Authors’ addresses: Dr Lutz Bachmann (for correspondence), Section for Zoology, Natural History Museums and Botanical Garden, University of Oslo, PO Box 1172 Blindern, N-0318 Oslo, Norway. E-mail: bachmann@nhm.uio.no; Petra Bareiß and Jürgen Tomiuk, Division of General Human Genetics, Institute of Anthropology and Human Genetics, University of Tübingen, Wilhelmstrasse 27, D-72074 Tübingen, Germany