Supplementary Text

1. Mutation Accumulation - Details of the MA protocols have been reported elsewhere [1, 2]. Briefly, in October 2008 starting stocks of the PB800 strain of C. briggsae and of the N2 strain of C. elegans were inbred by transfer of a single immature (L4-stage) hermaphrodite for six generations, after which populations were allowed to expand and MA lines were initiated. Ancestral control stocks were cryopreserved using standard methods [3]. Standard C. elegans husbandry conditions (NGM agar plates seeded with 100 μL of the OP50 strain of Escherichia coli) were used, with the exception of the temperature at which worms were maintained. A total of 192 replicate MA lines were created from each of the two starting stocks. For each stock, 96 replicate MA lines were maintained at 18°C and 96 lines at 26°C (henceforth "MA18" and "MA26", respectively). We initially transferred a single L4 hermaphrodite from each MA line (a "bottleneck") at 3-day intervals for MA26 lines and 5-day intervals for MA18 lines. In January 2010, generation time was increased to a 4-day interval for the C. elegans MA26 lines (which had undergone a maximum of 141 bottlenecks, "Gmax") and to a 6-day interval for both sets of MA18 lines (Gmax = 85) because these lines were consistently maturing more slowly. Due to the different generation times, at the end of the two-year mutation accumulation period, Gmax was 171 for C. briggsae at 26°C, 164 for C. elegans at 26°C, and 103 for both species at 18°C. Of the 96 initial lines in each treatment, none of the MA18 lines went extinct, one C. briggsae MA26 line went extinct, and 18 C. elegans MA26 lines went extinct. We maintained the two generations previous to the present generation as backup stocks. If a worm failed to reproduce, the replicate was re-started with an individual from the previous generation. "Going to backup" has the effect of reducing the number of bottlenecks experienced by a line, although it does not affect the actual number of generations of MA (Gmax). In an MA experiment, mutations that are under sufficiently weak selection (4Ne·s < 1) will accumulate at the neutral rate. Going to backup has the effect of changing Ne, and thereby the strength of selection that constitutes effective neutrality. In a population in which census size (N) fluctuates over time, Ne is given by the harmonic mean of N, which is determined largely by the smallest values of N [4, p. 109]. In three of the four sets of lines (both MA temperatures in C. briggsae and MA18 in C. elegans) we almost never had to go to backup, and Ne is essentially 1. In the C. elegans MA26 lines, however, the average number of bottlenecks was 89.4 out of 164 generations. Assuming the average census size of a backup plate is 200 adults (approximately the average number of offspring of an individual hermaphrodite; see Table 1), Ne for the C. elegans MA26 lines is ≈ 1.8 (assuming N = 2000 results in Ne ≈ 1.9). Thus, the threshold of effective neutrality, s < 1/(4Ne), is s ≈ 0.14 in the MA26 lines of C. elegans vs. s ≈ 0.25 in the other sets of lines.

2. Fitness Assay - Fitness was assayed in two blocks. For each species by MA temperature combination, 30 lines were randomly selected for each block. The two blocks contained different, non-overlapping sets of MA lines. MA lines were thawed along with ancestral controls; 15 worms were picked from the thawed sample of each control and used to establish replicate control lines. From each of the 60 MA lines (30 per MA temperature) and 15 control lines per species, 12 replicates were started, each from a single haphazardly chosen L4 hermaphrodite.
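The effective-neutrality arithmetic in Section 1 can be sketched as follows. This is a minimal illustration, not the authors' calculation: it assumes N = 1 in every bottleneck generation and N equal to the backup-plate census size in every remaining generation, and the function names are hypothetical.

```python
def harmonic_mean_ne(total_generations, bottleneck_generations, backup_census_size):
    """Harmonic-mean Ne, assuming N = 1 in bottleneck generations and
    N = backup_census_size in all remaining (backup) generations."""
    backup_generations = total_generations - bottleneck_generations
    sum_of_reciprocals = (bottleneck_generations / 1.0
                          + backup_generations / backup_census_size)
    return total_generations / sum_of_reciprocals


def neutrality_threshold(ne):
    """Selection coefficient below which a mutation is effectively neutral: 1/(4 Ne)."""
    return 1.0 / (4.0 * ne)


# C. elegans MA26 lines: 89.4 bottlenecks on average out of 164 generations.
ne_ma26 = harmonic_mean_ne(164, 89.4, 200)
print(round(ne_ma26, 2))                        # 1.83
print(round(neutrality_threshold(ne_ma26), 2))  # 0.14
print(neutrality_threshold(1.0))                # 0.25 (lines that never went to backup)
```

Under these assumptions the result is dominated by the bottleneck generations, so the assumed backup census size has little influence on Ne.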
Of these, 7 replicates were assayed at 26°C and the remaining 5 were assayed at 18°C (because survival was lower at 26°C during MA, the larger number of replicates at high temperature was intended to offset the expected attrition). Each plate was assigned a random number, identified only by that number, and handled only in random numerical order. Differences between generation times required that groups of worms be handled on different schedules; as a result, plates were assigned random numbers within groups that corresponded to generation time (all C. briggsae assayed at 26°C formed one group, all C. elegans assayed at 26°C a second group, and all lines assayed at 18°C a third group). Excepting the assay temperatures and the changes necessary to reflect the differences in generation time between assay temperatures, the fitness assay followed our canonical assay protocol [2]. Briefly, after lines were replicated (P1), an additional two generations of single-individual descent were carried out to account for parental and grandparental effects ("parental" generations P2, P3). If the P1 worm did not survive, we re-started the line from the plate of thawed worms; if P2 or P3 worms did not reproduce, that replicate was discarded and treated as missing data. For each replicate, a focal worm was selected haphazardly from the offspring of the P3 generation, and the lifetime reproduction of the focal worm was recorded. The focal worm was placed on a fresh, seeded NGM plate as a newly hatched offspring (L1) and returned to the incubator, where it was allowed to reach maturity and lay eggs (two days at 26°C; four days at 18°C). The worm was then moved to a new plate and returned to the incubator for one day at 26°C or two days at 18°C. To capture lifetime reproduction without allowing offspring to reproduce, the focal worm was moved for a second time to a new, seeded plate and incubated for an additional two days at 26°C or four days at 18°C.
All plates from which the focal individual had been removed were returned to the incubator until eggs had hatched (one day at 26°C; two days at 18°C) and then stored at 4°C. Upon completion of the assay, stored plates were stained with 0.075% toluidine blue and worms were counted under a dissecting microscope. Fitness data are archived in Supplementary Table 5.

3. Microsatellite Methods

3.1 Selection of Microsatellite Loci

The goal of this study was to compare how mutational properties differ with respect to temperature; to that end we selected loci that would maximize the probability of observing mutations. A previous study showed that di-nucleotide microsatellite mutation rates in Caenorhabditis are dependent on both repeat type and repeat length [5], and bioinformatic analysis indicated that AG di-nucleotide repeats are the most abundant repeat type in all five of the currently available Caenorhabditis genome assemblies. Therefore, we focused on AG di-nucleotide microsatellite repeats, and, because microsatellite mutation rates are positively correlated with the number of repeats [5, 6] and our aim was to maximize the number of mutations observed, we focused on longer repeats. We began by selecting primer pairs for AG repeats from our previous microsatellite study [5]. These microsatellite loci were ordered by number of repeats, and the 16 largest (excepting locus 103/104, which was deemed an outlier and not included in this study) were selected for use in C. briggsae. In C. elegans, loci used in [5] yielded an insufficient number of comparably sized microsatellites (≥9 repeats). As a result, new, size-matched loci were identified for C. elegans according to the following protocol. All perfect di-nucleotide microsatellites with 5 or more repeats (≥10 bp in length) were identified in the published genome of C. elegans (build WS205) using the PHOBOS algorithm version 3.3.12 (Christoph Mayer, Ruhr-Universität Bochum).
The following parameters were used: -M imperfect, -u minimum repeat length = 2, -U maximum repeat length = 2, -m mismatch score = -6, -r recursion depth = 7, -s minimum length score = 8, -f number of bases flanking a repeat = 250 bp. These search results were further filtered by repeat perfection and length using a set of custom Perl scripts. Loci from the 90th to 99th percentiles of the length distribution were selected, since this range was similar in size to the previously selected C. briggsae loci. PCR primers for the new C. elegans loci were designed using Primer3 software [7] with the default parameters; each locus (plus 250 bp on either side of the locus) was used to generate an in silico predicted "PCR fragment". PCR fragment size was constrained to between 100 bp and 400 bp for all loci. Using the PHOBOS algorithm version 3.3.12 (Christoph Mayer, Ruhr-Universität Bochum), predicted "PCR products" were then screened for the presence of any repeat motif (period size of 1 bp to 100 bp) other than the focal perfect AG di-nucleotide. The PHOBOS parameters were: -M imperfect, -u minimum repeat length = 2, -U maximum repeat length = 100, -m mismatch score = -6, -r recursion depth = 7, -s minimum length score = 8, -f number of bases flanking a repeat = 250 bp. If predicted "PCR products" contained repeats other than the focal AG di-nucleotide repeat, the locus was removed from the list. From the remaining list, 15 loci were selected. A complete list of primers for both species is included in Supplementary Table 1.

3.2 Genotyping

Genomic DNA was extracted from each MA line using the Qiagen 96-well DNeasy Blood and Tissue kit following the manufacturer's protocol (Qiagen, USA). We employed a nested PCR strategy with fluorescently tagged primers via a modification of the "three-primer" method of Schuelke [8].
Multiplexed PCR reactions of 15 µl were performed in 96-well plates, using 1 µl of DNA template, 40 pmol of selective primer, 4 pmol of M13-tail primer, 60 pmol of FAM-labeled M13 primer, and 7.5 µl of Qiagen Type-it Microsatellite PCR kit master mix (Qiagen, USA). Two to four loci were amplified simultaneously in a multiplex reaction with ~50 bp separating each locus; in some cases the amounts of selective primer and M13-tail primer were altered slightly to promote even amplification of different-sized fragments. PCR reactions were run with an initial denaturation of 5 minutes at 95°C followed by 10 cycles of 30 seconds denaturation at 95°C, 90 seconds annealing at 60°C, and 30 seconds extension at 72°C; the annealing temperature was then decreased to 48°C and the reaction was continued for an additional 18 cycles before a final extension step of 30 minutes at 60°C. Because many of the PCR products were similarly sized (constraining multiplexing), some loci were labeled with a second fluorescent dye (NED). These single-locus amplifications with the NED label were performed in 96-well plates, using 1 µl of DNA template, 60 pmol of selective primer, 6 pmol of M13-tail primer, 60 pmol of NED-labeled M13 primer, and 7.5 µl of Promega GoTaq Colorless Master Mix (Promega, USA). These PCR reactions were run for 10 cycles of 40 seconds denaturation at 94°C, 40 seconds annealing at 60°C, and 40 seconds extension at 72°C before the annealing temperature was decreased to 48°C for an additional 18 cycles. Multiplex and single-locus PCR products were pooled such that no loci were spaced closer than 20 bp, and PCR products were analyzed using an Applied Biosystems 3730XL DNA analyzer (Interdisciplinary Center for Biotechnology Research, University of Florida, USA). Fragment length was established relative to a known size-standard ladder (GeneScan 600, Applied Biosystems, USA). All genotypes were manually inspected using the GeneMarker version 1.6 software (SoftGenetics, USA).
Any genotype (i.e., fragment length) identified as different from the wild-type homozygote was re-genotyped from an independent PCR amplification; if the same genotype was observed in the re-genotyping, it was scored as a mutation.

4. Bootstrap analysis of ΔMw - Each Species/Assay temperature/Block combination constitutes an independent experiment, within which the two sets of MA lines (MA18 and MA26) are compared to the same set of ancestral (G0) control lines; the category "MA treatment" includes G0, MA18, and MA26. There are eight experiments in total: two assay blocks from each species at each assay temperature. Data from each experiment (ancestral control pseudolines and MA lines) were resampled with replacement by line (i.e., all replicates within a sampled line were present in the sample) from each assay block, maintaining the block structure (i.e., the number of lines within each MA treatment in each block was held constant). Maintaining the block structure implicitly treats block as a fixed effect rather than a random effect. For each bootstrap replicate, ΔMw was calculated for each block as described in the text and averaged over blocks. This procedure was repeated 1000 times; the 2.5th and 97.5th percentiles of the resampled distribution of the average ΔMw establish the lower and upper bounds of the 95% confidence interval for each group [9].

5. Calculation and bootstrap analysis of VM - The per-generation increase in genetic variance (the mutational variance, VM) is the product of the genomic mutation rate (U) and the average squared mutational effect, E[a^2]. VM was calculated from the resampled data described in Section 4 as follows. Each data point (w) was divided by the relevant G0 control (Species/Assay temperature/Block) mean, giving a re-scaled estimate of relative fitness, w*.
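The line-level resampling scheme of Section 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: ΔM is assumed to take the form (mean of MA lines - mean of G0 lines) / (t × mean of G0 lines), treatment means are taken as means of line means, and all function names and the toy data are hypothetical.

```python
import random
from statistics import mean


def delta_m(ma_lines, control_lines, generations):
    """Per-generation change in mean fitness scaled by the control mean.
    Assumed form: (mean_MA - mean_G0) / (t * mean_G0)."""
    ma_mean = mean(mean(reps) for reps in ma_lines)
    g0_mean = mean(mean(reps) for reps in control_lines)
    return (ma_mean - g0_mean) / (generations * g0_mean)


def resample_lines(lines, rng):
    """Draw whole lines with replacement; every replicate of a drawn line is kept,
    and the number of lines in the treatment is held constant."""
    return [rng.choice(lines) for _ in lines]


def bootstrap_delta_m(blocks, generations, n_boot=1000, seed=0):
    """blocks: list of (ma_lines, control_lines) pairs, one pair per assay block;
    each line is a list of replicate fitness values.
    Returns the (lower, upper) percentile bounds of the block-averaged delta M."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        per_block = [
            delta_m(resample_lines(ma, rng), resample_lines(g0, rng), generations)
            for ma, g0 in blocks
        ]
        estimates.append(mean(per_block))
    estimates.sort()
    lower = estimates[n_boot * 25 // 1000]
    upper = estimates[n_boot * 975 // 1000 - 1]
    return lower, upper


# Toy data (made-up numbers): three MA lines and three G0 pseudolines per block.
toy_block = (
    [[90, 95], [80, 85, 88], [100, 92]],
    [[100, 104], [98, 101], [103, 99]],
)
low, high = bootstrap_delta_m([toy_block, toy_block], generations=100, n_boot=200)
print(low, high)
```

Resampling within each block separately is what keeps the block structure fixed across bootstrap replicates.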
The variance of the re-scaled data, Var(w*), is the variance of the raw values divided by the square of the trait mean. This quantity is often designated I, the "opportunity for selection" [10], and is the most appropriate measure of variance for a trait under directional selection [11, 12]. We refer to (IMA - I0)/(2t) as VM, where IMA and I0 refer to the among-line components of variance in the MA lines and G0 control pseudolines, respectively, and t represents the number of generations of MA. For each group (Species/MA treatment/Assay temperature/Block) we estimated the among-line and within-line components of variance using restricted maximum likelihood (REML) as implemented in the MIXED procedure of SAS v. 9.21. Degrees of freedom were calculated by the Kenward-Roger method. Bootstrap 95% confidence intervals were calculated as in Section 4.

6. Calculation of mutation parameters UMIN and E[a]MAX. If it is assumed that all mutations have equal effects, 2(ΔM)^2/VM provides a downwardly biased estimate of the genomic (diploid) mutation rate for alleles that affect the trait, UMIN, and VM/(2ΔM) provides an upwardly biased estimate of the average effect of a mutation on the trait, E[a]MAX [13, 14]. UMIN and E[a]MAX were calculated from the between-block averages of ΔMw and VM described in Sections 4 and 5. We report only point estimates and do not attempt to assign statistical significance to the estimates, for two reasons. First, most sets of bootstrap replicates include replicates in which the REML estimate of VM is either (1) equal to 0, in which case UMIN is undefined, converging on positive infinity, and E[a]MAX is 0, or (2) non-zero but extremely small, leading to inflated estimates of UMIN. Ignoring these replicates biases the estimate of the sampling variance in an unknown way. Second, and more importantly, in three of the eight cases the lower bound on VM is 0 and ΔMw is significantly different from 0, in which case the logical upper bound on UMIN is positive infinity.
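The formulas for VM, UMIN, and E[a]MAX given in Sections 5 and 6 can be written out as a short sketch. The function names and the example numbers are hypothetical; the among-line variance components are taken as given inputs rather than estimated by REML, and ΔM is assumed to be non-zero.

```python
import math


def mutational_variance(i_ma, i_g0, generations):
    """VM = (I_MA - I_0) / (2t), where I_MA and I_0 are among-line variance
    components of mean-scaled fitness and t is the number of MA generations."""
    return (i_ma - i_g0) / (2.0 * generations)


def bateman_mukai(delta_m, vm):
    """Return (U_MIN, E[a]_MAX) under the equal-effects assumption:
    U_MIN = 2 * dM^2 / VM and E[a]_MAX = VM / (2 * dM).
    delta_m must be non-zero; VM = 0 gives U_MIN = +infinity and E[a]_MAX = 0."""
    ea_max = vm / (2.0 * delta_m)
    u_min = math.inf if vm == 0 else 2.0 * delta_m ** 2 / vm
    return u_min, ea_max


# Made-up inputs for illustration only.
vm = mutational_variance(i_ma=0.03, i_g0=0.01, generations=100)
u_min, ea_max = bateman_mukai(delta_m=-0.001, vm=vm)
print(vm, u_min, ea_max)

# A zero variance estimate shows why bootstrap replicates with VM = 0 are problematic.
print(bateman_mukai(delta_m=-0.001, vm=0.0))
```

Because UMIN has VM in the denominator, very small variance estimates inflate it without bound, which is the behavior described in the first reason above.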
Rather than report an unreliable estimate of the sampling variances of the Bateman-Mukai (B-M) parameters, we prefer to make no statement about the sampling variances and emphasize that the point estimates should be interpreted with considerable caution.

References

1. Vassilieva L.L., Lynch M. 1999 The rate of spontaneous mutation for life-history traits in Caenorhabditis elegans. Genetics 151, 119-129.
2. Baer C.F., Shaw F., Steding C., Baumgartner M., Hawkins A., Houppert A., Mason N., Reed M., Simonelic K., Woodard W., et al. 2005 Comparative evolutionary genetics of spontaneous mutations affecting fitness in rhabditid nematodes. Proceedings of the National Academy of Sciences of the United States of America 102, 5785-5790.
3. Wood W.B. 1988 The Nematode Caenorhabditis elegans, pp. 1-16. Plainview, NY, Cold Spring Harbor Laboratory Press.
4. Crow J.F., Kimura M. 1970 An Introduction to Population Genetics Theory. Caldwell, NJ, The Blackburn Press.
5. Phillips N., Salomon M., Custer A., Ostrow D., Baer C.F. 2009 Spontaneous mutational and standing genetic (co)variation at dinucleotide microsatellites in Caenorhabditis briggsae and Caenorhabditis elegans. Molecular Biology and Evolution 26, 659-669. (doi:10.1093/molbev/msn287).
6. Ellegren H. 2004 Microsatellites: Simple sequences with complex evolution. Nature Reviews Genetics 5, 435-445. (doi:10.1038/nrg1348).
7. Rozen S., Skaletsky H.J. 2000 Primer3 on the WWW for general users and for biologist programmers. In Bioinformatics Methods and Protocols: Methods in Molecular Biology (eds. Krawetz S., Misener S.), pp. 365-386. Totowa, NJ, Humana Press.
8. Schuelke M. 2000 An economic method for the fluorescent labeling of PCR fragments. Nature Biotechnology 18, 233-234.
9. Efron B., Tibshirani R.J. 1993 An Introduction to the Bootstrap. New York, Chapman and Hall.
10. Crow J.F. 1958 Some possibilities for measuring selection intensities in man. Human Biology 30, 1-13.
11. Houle D. 1992 Comparing evolvability and variability of quantitative traits. Genetics 130, 195-204.
12. Wade M.J. 2006 Selection. In Evolutionary Genetics (eds. Fox C.W., Wolf J.B.), pp. 49-64. New York, Oxford University Press.
13. Bateman K.G. 1959 The genetic assimilation of four venation phenocopies. Journal of Genetics 56, 443-447.
14. Mukai T. 1964 Genetic structure of natural populations of Drosophila melanogaster. 1. Spontaneous mutation rate of polygenes controlling viability. Genetics 50, 1-19.