Appendix S1. Conceptual flowchart for developing new microsatellite markers based on the enrichment technique (one of many methods in use; see Zane et al. 2004 and Glenn 2005), and primer optimization steps.

A. Extract DNA from a single tissue sample.
B. Create a DNA library:
   1. Cut the genome into fragments of about 500 bp with a restriction enzyme digest.
   2. Attach ‘linker’ DNA to the ends of each fragment. Linker DNA has a known sequence, so primers can be designed to bind to it.
   3. Amplify the DNA fragments by PCR using primers for the linker ends.
C. Separate out fragments with repeat sequences:
   1. Mix the DNA fragments with a microsatellite probe (an oligonucleotide made of a repeat sequence of your choice) that can be recovered magnetically.
   2. Promote the hybridization of probes to any complementary repeat sequences in the DNA fragments by heating to denature the DNA and cooling slowly.
   3. Hold a magnet to the tube to attract the probes (now bound to the DNA), and wash away the unbound DNA with a series of rinses.
D. Sequence the fragments to find microsatellite loci:
   1. Using primers for the linker DNA, amplify the DNA with PCR to concentrate it.
   2. Clone the DNA to prepare it for sequencing: insert it into a plasmid, inoculate bacteria with the plasmid, and grow the bacteria to replicate the DNA.
   3. Isolate the DNA from the bacteria.
   4. Sequence the microsatellite DNA in the plasmid with primers targeted to the insertion points on the plasmid.
E. Examine the sequences to find microsatellite repeats.
F. Design primers for the flanking regions of the microsatellites (with help from primer selection software such as Primer3, which selects optimal primer sites) and have them made.
G. Attempt amplification of loci with the new primers. Use a gradient of PCR conditions in which the temperatures, times, and magnesium and primer concentrations vary to find optimal conditions.
H. Use gel electrophoresis to confirm the presence of PCR products. Discard primer pairs that fail to amplify after several attempts.
I. Check for polymorphism by running the successful primer pairs on 10-20 individuals. Estimate allelic diversity and heterozygosity levels. Discard invariant loci.
J. Check for reliability. Rerun the successful primer pairs on the same individuals twice more to ensure that genotype scoring is consistently reproducible. Discard loci with unreliable amplification.
K. Order fluorescently labeled primers for the remaining loci. Complete the full screening process detailed in the text. Discard problematic loci.
L. Streamline the genotyping of the full dataset with the remaining loci by establishing a “multiplex” PCR protocol, in which multiple loci are amplified in a single PCR using primers labeled with different dyes.

Appendix S2. Scoring microsatellite genotypes from sequencer data output.

Background on microsatellite genotyping with a DNA sequencer

A DNA sequencer is a highly precise gel electrophoresis apparatus. PCR products are loaded onto the gel and separated by size by applying an electric field. A laser scans the gel to detect bands containing fluorescent dye. The primers used in the PCR are tagged with different fluorescent dyes to enable this detection. The sequencer software converts the banding pattern into a plot with peaks corresponding to the width and intensity (height) of each band. The position of the peak along the x-axis corresponds to the size of the DNA product in the band, measured in base pairs (bp). The height (intensity) corresponds to the concentration of the DNA product, which is a consequence of the efficiency of the amplification process in PCR. One color is used for a size standard to calibrate the band positions with the size of the DNA product (here it is red). An asterisk identifies all true microsatellite alleles in the figures below.
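The role of the size standard described above can be sketched in code. The minimal Python example below assumes hypothetical ladder data (scan positions of the size-standard peaks and their known sizes in bp) and estimates the size of a sample peak by linear interpolation between the two flanking ladder peaks. The function name, the ladder values, and the choice of simple linear interpolation are illustrative assumptions; sequencer software may use more elaborate sizing methods.

```python
from bisect import bisect_right


def estimate_size_bp(peak_pos, ladder_pos, ladder_bp):
    """Estimate fragment size (bp) for a peak at scan position peak_pos.

    ladder_pos: scan positions of the size-standard peaks, ascending.
    ladder_bp:  known sizes (bp) of those size-standard fragments.
    Linear interpolation between the two flanking ladder peaks is an
    assumed simplification used here for illustration only.
    """
    if len(ladder_pos) != len(ladder_bp) or len(ladder_pos) < 2:
        raise ValueError("need at least two matched ladder peaks")
    if not ladder_pos[0] <= peak_pos <= ladder_pos[-1]:
        raise ValueError("peak lies outside the size-standard range")
    # Index of the first ladder peak to the right of the sample peak,
    # clamped so a peak sitting on the last ladder peak still has a pair.
    hi = min(bisect_right(ladder_pos, peak_pos), len(ladder_pos) - 1)
    lo = hi - 1
    frac = (peak_pos - ladder_pos[lo]) / (ladder_pos[hi] - ladder_pos[lo])
    return ladder_bp[lo] + frac * (ladder_bp[hi] - ladder_bp[lo])


# Hypothetical size standard: scan positions and known fragment sizes.
LADDER_POS = [1000, 1300, 1590, 1870, 2140]
LADDER_BP = [100, 125, 150, 175, 200]

if __name__ == "__main__":
    for pos in (1150, 1730):
        print(pos, estimate_size_bp(pos, LADDER_POS, LADDER_BP))
```

A peak that falls outside the ladder range raises an error rather than being extrapolated, since sizes beyond the standard cannot be calibrated by this approach.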
[Figure: sequencer output plots, panels A-D; x-axis shows fragment size in bp, with ticks at 100, 125, 150, 175 and 200.]

A. An ideal output: The two alleles of this heterozygote are even in height and easy to distinguish from the “stutter peaks” adjacent to them. During PCR some products are 1, 2 or 3 repeats short due to errors in replication (similar to step-wise mutation) and show up as evenly spaced peaks with decreasing height to the left of the true peak. Some loci show extensive stuttering and others show virtually none. The stutter bands are useful for distinguishing microsatellite products from non-specific or non-target products. Notice that there are “pull-up peaks” in the green section of the spectrum. Pull-up is due primarily to spectral overlap in the emission spectra of the dyes, which the sequencer records as a false peak in a different color. This is a common artifact of the DNA sequencer’s analysis process that can create confusion in some situations.

B. Two examples of trickier outputs: The blue genotype is a heterozygote but the 2 alleles are only 1 repeat different in size. This creates a characteristic pattern for loci with stutter in which the second peak is higher than the first because of the additive intensity of the larger allele’s first stutter peak and the smaller allele’s true peak. If the first two peaks were equal in height it would be difficult to determine if the genotype has one or two alleles. One clue is that there are 4 stutter peaks to the right of the largest peak instead of 3 (assuming the pattern from plot A is characteristic of the blue locus). The second locus in this plot is shown with black dye and has larger alleles. This locus has no stutter, perhaps because there are few repeats so that the Taq polymerase doesn’t make stepwise errors. This genotype is also a heterozygote, but the larger allele is faint. Larger alleles usually show at least slightly shorter peaks because PCR is less efficient for longer products. Here, the larger allele is so small that it could be easily overlooked or mistaken for noise.
If any less PCR product were loaded onto the gel it might not show up at all, in which case it would be an example of “large allele drop-out,” another common source of inflated homozygosity counts.

C. More examples: When multiple loci are loaded in the same gel lane for efficiency, or amplified together in one “multiplex” PCR, allele peaks can overlap and are sometimes easy to miss. Here the green and blue loci share an allele size. Distinguishing the true alleles is made even more difficult by the green pull-up peaks. If the green allele product were less intense than the blue allele product (instead of equal as shown here), it might be mistaken for a pull-up peak. Re-running the green locus separately in such cases will minimize scoring error. Here again, the blue locus shows two alleles that differ by one base pair. The heights are the same because this locus does not show strong stutter. But the blue locus does show small flanking peaks: the one on the left is a very faint stutter (only the largest stutter peak is visible), and the one on the right might be an “A-Tail,” which arises when the Taq adds an extra adenine nucleotide onto some copies of the product, increasing their length by 1 bp. It will not be mistaken for a microsatellite allele because an extreme height difference would not occur for 2 alleles so close in size, as their amplification efficiencies should be similar. The black locus is a homozygote. Even though there is a small black peak on the left side of the plot, a smaller allele almost never has a shorter peak than a larger allele. An additional clue is that the larger black peak is quite fat and tall: PCR produces approximately double the amount of product when an allele is homozygous because it does not compete for the Taq with a second allele. The purple locus represents a “split peak” problem that results from high rates of A-Tailing by the Taq.
The “+A” peaks occur for all of the stutter peaks, making the scoring difficult, especially for heterozygotes with closely sized alleles. Although the true allele is denoted here, a locus with alleles that look like the purple one would be too difficult to score reliably. Usually the problem can be corrected by adding an extra extension step to the PCR program that gives the Taq time to add an A-Tail to all copies of the product consistently (bumping all alleles and stutter peaks up in size by 1 bp from their true length).

D. Unscorable loci: The blue locus is a “stegosaur” with unacceptably high stutter. The black locus has too many non-specific artifact peaks to reliably choose the microsatellite alleles. The green locus has been overloaded on the gel or has unusually high PCR product concentration and is smearing in the lane.

Appendix S3. Citations for the papers used to generate Table 3.

We examined the most recent papers in the journals Molecular Ecology and Evolution and chose the first 25 from each journal, in reverse chronological order from the July 2005 issues, that used microsatellites to assess ecological and genetic traits of single or multiple species. We excluded studies that used microsatellite markers taken from previously published research studies to minimize the chance that some tests were done previously and therefore not reported in the current study. We then surveyed these papers to estimate the frequency and results of reported marker screening as explained in Part III of the text and Table 2.

Molecular Ecology

Astanei, I., Gosling, E., Wilson, J. & Powell, E. (2005). Genetic variability and phylogeography of the invasive zebra mussel, Dreissena polymorpha (Pallas). Molecular Ecology, 14, 1655-1666.
Baums, I.B., Miller, M.W. & Hellberg, M.E. (2005). Regionally isolated populations of an imperiled Caribbean coral, Acropora palmata. Molecular Ecology, 14, 1377-1390.
Bottin, L., Verhaegen, D., Tassin, J., Olivieri, I., Vaillant, A. & Bouvet, J.M. (2005). Genetic diversity and population structure of an insular tree, Santalum austrocaledonicum in New Caledonian archipelago. Molecular Ecology, 14, 1979-1989.
Bowen, B.W., Bass, A.L., Soares, L. & Toonen, R.J. (2005). Conservation implications of complex population structure: lessons from the loggerhead turtle (Caretta caretta). Molecular Ecology, 14, 2389-2402.
Charmantier, A. & Reale, D. (2005). How do misassigned paternities affect the estimation of heritability in the wild? Molecular Ecology, 14, 2839-2850.
Colautti, R.I., Manca, M., Viljanen, M., Ketelaars, H.A.M., Burgi, H., Macisaac, H.J. & Heath, D.D. (2005). Invasion genetics of the Eurasian spiny waterflea: evidence for bottlenecks and gene flow using microsatellites. Molecular Ecology, 14, 1869-1879.
Fredsted, T., Pertoldi, C., Schierup, M.H. & Kappeler, P.M. (2005). Microsatellite analyses reveal fine-scale genetic structure in grey mouse lemurs (Microcebus murinus). Molecular Ecology, 14, 2363-2372.
Funk, W.C., Blouin, M.S., Corn, P.S., Maxell, B.A., Pilliod, D.S., Amish, S. & Allendorf, F.W. (2005). Population structure of Columbia spotted frogs (Rana luteiventris) is strongly affected by the landscape. Molecular Ecology, 14, 483-496.
Goossens, B., Chikhi, L., Jalil, M.F., Ancrenaz, M., Lackman-Ancrenaz, I., Mohamed, M., Andau, P. & Bruford, M.W. (2005). Patterns of genetic diversity and migration in increasingly fragmented and declining orang-utan (Pongo pygmaeus) populations from Sabah, Malaysia. Molecular Ecology, 14, 441-456.
Hauswaldt, J.S. & Glenn, T.C. (2005). Population genetics of the diamondback terrapin (Malaclemys terrapin). Molecular Ecology, 14, 723-732.
Jones, K.L., Krapu, G.L., Brandt, D.A. & Ashley, M.V. (2005). Population genetic structure in migratory sandhill cranes and the role of Pleistocene glaciations. Molecular Ecology, 14, 2645-2657.
Keeney, D.B., Heupel, M.R., Hueter, R.E. & Heist, E.J. (2005). Microsatellite and mitochondrial DNA analyses of the genetic structure of blacktip shark (Carcharhinus limbatus) nurseries in the northwestern Atlantic, Gulf of Mexico, and Caribbean Sea. Molecular Ecology, 14, 1911-1923.
Magalon, H., Adjeroud, M. & Veuille, M. (2005). Patterns of genetic variation do not correlate with geographical distance in the reef-building coral Pocillopora meandrina in the South Pacific. Molecular Ecology, 14, 1861-1868.
Maki-Petays, H., Zakharov, A., Viljakainen, L., Corander, J. & Pamilo, P. (2005). Genetic changes associated to declining populations of Formica ants in fragmented forest landscape. Molecular Ecology, 14, 733-742.
McRae, B.H., Beier, P., Dewald, L.E., Huynh, L.Y. & Keim, P. (2005). Habitat barriers limit gene flow and illuminate historical events in a wide-ranging carnivore, the American puma. Molecular Ecology, 14, 1965-1977.
Mesquita, N., Hanfling, B., Carvalho, G.R. & Coelho, M.M. (2005). Phylogeography of the cyprinid Squalius aradensis and implications for conservation of the endemic freshwater fauna of southern Portugal. Molecular Ecology, 14, 1939-1954.
Michaux, J.R., Hardy, O.J., Justy, F., Fournier, P., Kranz, A., Cabria, M., Davison, A., Rosoux, R. & Libois, R. (2005). Conservation genetics and population history of the threatened European mink Mustela lutreola, with an emphasis on the west European population. Molecular Ecology, 14, 2373-2388.
Otero-Arnaiz, A., Casas, A., Hamrick, J.L. & Cruse-Sanders, J. (2005). Genetic variation and evolution of Polaskia chichipe (Cactaceae) under domestication in the Tehuacan Valley, central Mexico. Molecular Ecology, 14, 1603-1611.
Oyler-McCance, S.J., Taylor, S.E. & Quinn, T.W. (2005). A multilocus population genetic survey of the greater sage-grouse across their range. Molecular Ecology, 14, 1293-1310.
Shrivastava, J., Qian, B.Z., McVean, G. & Webster, J.P. (2005). An insight into the genetic variation of Schistosoma japonicum in mainland China using DNA microsatellite markers. Molecular Ecology, 14, 839-849.
Spear, S.F., Peterson, C.R., Matocq, M.D. & Storfer, A. (2005). Landscape genetics of the blotched tiger salamander (Ambystoma tigrinum melanostictum). Molecular Ecology, 14, 2553-2564.
Sutherland, D.R., Spencer, P.B.S., Singleton, G.R. & Taylor, A.C. (2005). Kin interactions and changing social structure during a population outbreak of feral house mice. Molecular Ecology, 14, 2803-2814.
Westneat, D.F. & Mays, H.L. (2005). Tests of spatial and temporal factors influencing extra-pair paternity in red-winged blackbirds. Molecular Ecology, 14, 2155-2167.
Whitehead, A., Anderson, S.L., Kuivila, K.M., Roach, J.L. & May, B. (2003). Genetic variation among interconnected populations of Catostomus occidentalis: implications for distinguishing impacts of contaminants from biogeographical structuring. Molecular Ecology, 12, 2817-2833.
Wright, T.F., Rodriguez, A.M. & Fleischer, R.C. (2005). Vocal dialects, sex-biased dispersal, and microsatellite population structure in the parrot Amazona auropalliata. Molecular Ecology, 14, 1197-1205.

Evolution

Allendorf, F.W. & Seeb, L.W. (2000). Concordance of genetic divergence among sockeye salmon populations at allozyme, nuclear DNA, and mitochondrial DNA markers. Evolution, 54, 640-651.
Arnegard, M.E., Bogdanowicz, S.M. & Hopkins, C.D. (2005). Multiple cases of striking genetic similarity between alternate electric fish signal morphs in sympatry. Evolution, 59, 324-343.
Bacles, C.F.E., Burczyk, J., Lowe, A.J. & Ennos, R.A. (2005). Historical and contemporary mating patterns in remnant populations of the forest tree Fraxinus excelsior L. Evolution, 59, 979-990.
Castric, V., Bonney, F. & Bernatchez, L. (2001). Landscape structure and hierarchical genetic diversity in the brook charr, Salvelinus fontinalis. Evolution, 55, 1016-1028.
Chapuisat, M., Bocherens, S. & Rosset, H. (2004). Variable queen number in ant colonies: No impact on queen turnover, inbreeding, and population genetic differentiation in the ant Formica selysi. Evolution, 58, 1064-1072.
Clarke, K.E., Rinderer, T.E., Franck, P., Quezada-Euan, J.G. & Oldroyd, B.P. (2002). The Africanization of honeybees (Apis mellifera L.) of the Yucatan: A study of a massive hybridization event across time. Evolution, 56, 1462-1474.
Dutech, C., Maggia, L., Tardy, C., Joly, H.I. & Jarne, P. (2003). Tracking a genetic signal of extinction-recolonization events in a neotropical tree species: Vouacapoua americana Aublet in French Guiana. Evolution, 57, 2753-2764.
Evans, B.J., Supriatna, J. & Melnick, D.J. (2001). Hybridization and population genetics of two macaque species in Sulawesi, Indonesia. Evolution, 55, 1686-1702.
Goodisman, M.A.D. & Crozier, R.H. (2002). Population and colony genetic structure of the primitive termite Mastotermes darwiniensis. Evolution, 56, 70-83.
Hansson, B., Westerdahl, H., Hasselquist, D., Akesson, M. & Bensch, S. (2004). Does linkage disequilibrium generate heterozygosity-fitness correlations in great reed warblers? Evolution, 58, 870-879.
Hendry, A.P., Taylor, E.B. & McPhail, J.D. (2002). Adaptive divergence and the balance between selection and gene flow: Lake and stream stickleback in the Misty system. Evolution, 56, 1199-1216.
Heuertz, M., Hausman, J.F., Hardy, O.J., Vendramin, G.G., Frascaria-Lacoste, N. & Vekemans, X. (2004). Nuclear microsatellites reveal contrasting patterns of genetic structure between western and southeastern European populations of the common ash (Fraxinus excelsior L.). Evolution, 58, 976-988.
Hoffman, J.I., Boyd, I.L. & Amos, W. (2003). Male reproductive strategy and the importance of maternal status in the Antarctic fur seal Arctocephalus gazella. Evolution, 57, 1917-1930.
Hufford, K.M. & Hamrick, J.L. (2003). Viability selection at three early life stages of the tropical tree, Platypodium elegans (Fabaceae, Papilionoideae). Evolution, 57, 518-526.
Lampert, K.P., Lamatsch, D.K., Epplen, J.T. & Schartl, M. (2005). Evidence for a monophyletic origin of triploid clones of the Amazon molly, Poecilia formosa. Evolution, 59, 881-889.
Noor, M.A.F., Pascual, M. & Smith, K.R. (2000). Genetic variation in the spread of Drosophila subobscura from a nonequilibrium population. Evolution, 54, 696-703.
Peakall, R., Ruibal, M. & Lindenmayer, D.B. (2003). Spatial autocorrelation analysis offers new insights into gene flow in the Australian bush rat, Rattus fuscipes. Evolution, 57, 1182-1195.
Pujolar, J.M., Maes, G.E., Vancoillie, C. & Volckaert, F.A.M. (2005). Growth rate correlates to individual heterozygosity in the European eel, Anguilla anguilla L. Evolution, 59, 189-199.
Scribner, K.T., Arntzen, J.W., Cruddace, N., Oldham, R.S. & Burke, T. (2001). Environmental correlates of toad abundance and population genetic diversity. Biol. Conserv., 98, 201-210.
Storz, J.F. (2002). Contrasting patterns of divergence in quantitative traits and neutral DNA markers: analysis of clinal variation. Molecular Ecology, 11, 2537-2551.
Storz, J.F., Bhat, H.R. & Kunz, T.H. (2001). Genetic consequences of polygyny and social structure in an Indian fruit bat, Cynopterus sphinx. I. Inbreeding, outbreeding, and population subdivision. Evolution, 55, 1215-1223.
Thelen, G.C. & Allendorf, F.W. (2001). Heterozygosity-fitness correlations in rainbow trout: Effects of allozyme loci or associative overdominance? Evolution, 55, 1180-1187.
Turgeon, J. & Bernatchez, L. (2001). Clinal variation at microsatellite loci reveals historical secondary intergradation between glacial races of Coregonus artedi (Teleostei: Coregoninae). Evolution, 55, 2274-2286.
Vargo, E.L. (2003). Hierarchical analysis of colony and population genetic structure of the eastern subterranean termite, Reticulitermes flavipes, using two classes of molecular markers. Evolution, 57, 2805-2818.
Vassiliadis, C., Saumitou-Laprade, P., Lepart, J. & Viard, F. (2002). High male reproductive success of hermaphrodites in the androdioecious Phillyrea angustifolia. Evolution, 56, 1362-1373.