A century-long genetic record reveals that protist effective population sizes are comparable to those of macroscopic species.

Phillip C. Watts1,2,*, Nina Lundholm3, Sofia Ribeiro4, Marianne Ellegaard5

1 Institute of Integrative Biology, Biosciences Building, University of Liverpool, Liverpool, L69 7ZB, UK.
2 Department of Biological and Environmental Science, University of Jyväskylä, P.O. Box 35, FI-40014 Jyväskylä, Finland.
3 The Natural History Museum of Denmark, Sølvgade 83S, DK-1307 Kbh K, Denmark.
4 Geological Survey of Denmark and Greenland, Department of Marine Geology and Glaciology, Øster Voldgade 10, 1350 KBH-K, Denmark.
5 Department of Biology, University of Copenhagen, Øster Farimagsgade 2D, DK-1353 Copenhagen K, Denmark.

Electronic Supplementary Material (ESM)

1. Methods

The total abundance of cells in Koljö Fjord was estimated using data on the densities of Pentapharsodinium dalei in a neighboring fjord (Gullmar Fjord) and the bathymetry of Koljö Fjord. The resulting figures are estimates of the population size of the vegetative stages (those that increase by cell division) at a given time. Godhe et al. [1] quantified numbers of cysts and vegetative stages of P. dalei / Scrippsiella spp. (which are difficult to tell apart in the vegetative stage) from sediment traps. The maximum density of vegetative cells was 15,000 cells l⁻¹, but because Godhe and co-workers typically found a few thousand cells l⁻¹, we selected a conservative value of 1,500 cells l⁻¹. The area of Koljö Fjord is 12.9 km², the halocline is situated at ca. 15 m and there is limited water exchange between the fjord and the Kattegat [2]. It is reasonable to assume that the vegetative cells of P. dalei are distributed throughout the water above the halocline. The estimated number of vegetative cells was thus calculated as:

1,500 cells l⁻¹ x 1,000 l m⁻³ x 15 m = 22,500,000 cells m⁻² above the halocline.

Total number of cells in the fjord = 22,500,000 cells m⁻² x 12,900,000 m² = 2.9 x 10¹⁴ cells.
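The abundance calculation above can be reproduced in a few lines. This is a minimal sketch that uses only the figures quoted in the text (cell density, halocline depth and fjord area); the constant and function names are illustrative and are not part of the original analysis.

```python
# Figures quoted in the Methods text.
CELLS_PER_LITRE = 1_500        # conservative density of vegetative cells
LITRES_PER_M3 = 1_000          # unit conversion: litres per cubic metre
HALOCLINE_DEPTH_M = 15         # cells assumed to occupy water above the halocline
FJORD_AREA_M2 = 12_900_000     # 12.9 square kilometres expressed in square metres


def cells_per_m2(cells_per_litre=CELLS_PER_LITRE, depth_m=HALOCLINE_DEPTH_M):
    """Cells in the water column above one square metre of halocline."""
    return cells_per_litre * LITRES_PER_M3 * depth_m


def total_cells(area_m2=FJORD_AREA_M2):
    """Total vegetative cells in the fjord above the halocline."""
    return cells_per_m2() * area_m2


if __name__ == "__main__":
    print(f"{cells_per_m2():,} cells per square metre")
    print(f"{total_cells():.2e} cells in the fjord")
```

Because the calculation is a simple product, the total scales directly with the assumed density: using the maximum of 15,000 cells per litre reported by Godhe et al. [1] instead of 1,500 would raise the estimate tenfold.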
We reanalysed our data set to examine the effects of (1) potential null alleles (i.e. the few non-amplifying data) and (2) uncertainty in dating the slices of sediment core upon our estimates of contemporary Ne. First, since our genotype data are based on the haploid phase, we treated missing data as null alleles by recoding non-amplifying genotypes as new, distinct alleles. Second, we calculated maximum and minimum time intervals between all pairs of samples based on the upper and lower 95% confidence intervals (95% CI) for the dates associated with each sediment core slice (see main text Methods for the standard errors associated with each core date); 95% CIs were calculated as twice the standard error. Note that there was no error associated with the date of the top core layer (the 2006 layer). Following these two treatments of the original data, contemporary estimates of Ne (and 95% CI) for all sample pairs were made using NEESTIMATOR v.2 [3] to calculate Ne based on Waples' moment method [4], which incorporated Nei & Tajima's [5] standardised variance in allele frequency. As in the main manuscript, our calculations assumed a one-year generation time (i.e. the timing of the sexual cycle) and a closed population. We did not use MLNe [6], as the estimates of Ne generated by this software are similar to those provided by Waples' estimator (see Table 1, main manuscript).

2. Results

The overall effect of recoding missing data was minimal: the estimates of Ne based on data that incorporated distinct null alleles were slightly higher than the estimates of Ne that simply allowed missing data. All estimates of Ne were of the same order of magnitude and had overlapping 95% CIs (Table ESM1). Increasing the estimated number of generations among samples (maximum temporal separation) increased the estimate of Ne, and decreasing it (minimum temporal separation) reduced the estimate. This is not surprising, as the rate of genetic change is associated with Ne: the same observed change in allele frequencies spread over more generations implies a larger Ne.
However, it is notable that none of the new estimates of Ne is particularly large (i.e. the estimates are mostly a few hundred), ranging between 246 (for estimated core slice dates of 2006-1979) and 1,650 (for estimated core slice dates of 2006-1898), and with a maximum upper 95% CI of 8,376 (Table ESM2).

References

1. Godhe A, Noren F, Kuylenstierna M, Ekberg C, Karlson B. 2001 Relationship between planktonic dinoflagellate abundance, cysts recovered in sediment traps and environmental factors in the Gullmar Fjord, Sweden. J. Plank. Res. 23, 923-938.
2. Svenningsen L. 2010 A snapshot of a deep water renewal event in the Koljö Fjord system. BSc. Thesis, University of Gothenburg.
3. Do C, Waples RS, Peel D, Macbeth GM, Tillet BJ, Ovenden JR. 2013 NeEstimator V2: reimplementation of software for the estimation of contemporary effective population size (Ne) from genetic data. Mol. Ecol. Res. In press.
4. Waples RS. 1989 A generalised approach for estimating effective population size from temporal changes in allele frequency. Genetics 121, 379-391.
5. Nei M, Tajima F. 1981 Genetic drift and estimation of effective population size. Genetics 98, 625-640.
6. Wang J, Whitlock MC. 2003 Estimating effective population size and migration rates from genetic samples over space and time. Genetics 163, 429-446.

Table ESM1. Effect of coding missing (potential null) alleles as either missing data or distinct alleles upon estimates of contemporary effective population size (Ne) for samples of the marine dinoflagellate Pentapharsodinium dalei revived from sediment cores (dated to 1922, 1960, 1985 and 2006) from Koljö Fjord, Sweden. Average contemporary Ne (±95% confidence intervals) was estimated for pairs of samples using Waples' [4] moment estimator.
Sample comparison    Missing data           Nulls recoded
                     Ne (±95% CI)           Ne (±95% CI)
2006 - 1985          178 (98-343)           191 (107-361)
2006 - 1960          375 (195-793)          404 (213-841)
2006 - 1922          1,183 (476-5,545)      1,283 (512-6,514)
1985 - 1960          342 (166-905)          418 (199-1,221)
1985 - 1922          652 (325-1,528)        697 (355-1,591)
1960 - 1922          266 (134-570)          291 (150-621)

Table ESM2. Effect of uncertainty associated with dating sediment cores upon estimates of contemporary effective population size (Ne) for samples of P. dalei from Koljö Fjord, Sweden. Dates for slices of sediment core are based on the upper and lower 95% CIs associated with sediment dating to provide maximum and minimum temporal separation between pairs of samples. Average contemporary Ne (±95% confidence intervals) was estimated for pairs of samples using Waples' [4] moment estimator.

                 Maximum temporal separation               Minimum temporal separation
                 Sample comparison    Ne (±95% CI)         Sample comparison    Ne (±95% CI)
Missing data     2006 - 1979          229 (126-441)        2006 - 1991          127 (70-245)
                 2006 - 1950          457 (237-966)        2006 - 1970          294 (152-621)
                 2006 - 1898          1,520 (611-7,129)    2006 - 1946          845 (340-3,961)
Nulls recoded    2006 - 1979          246 (138-464)        2006 - 1991          136 (76-258)
                 2006 - 1950          492 (259-1,023)      2006 - 1970          316 (167-658)
                 2006 - 1898          1,650 (658-8,376)    2006 - 1946          917 (366-4,653)
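The relationship between Tables ESM1 and ESM2 can be checked with simple arithmetic. In moment-based temporal estimators, Ne is proportional to the number of generations separating two samples when the standardised variance in allele frequency and the sample sizes are held fixed. Under that assumption, re-dating a sample should rescale the point estimate by the ratio of the new interval to the original one. The sketch below applies this to the "missing data" estimates involving the 2006 layer, assuming one generation per year as in the Methods. The names are illustrative, and this is a consistency check only, not a reimplementation of NEESTIMATOR.

```python
TOP_LAYER = 2006  # date of the top core layer, which has no dating error

# 'Missing data' point estimates from Table ESM1, keyed by the older sample's date.
ORIGINAL_NE = {1985: 178, 1960: 375, 1922: 1183}

# Re-dated older samples from Table ESM2: (maximum separation, minimum separation).
REDATED = {1985: (1979, 1991), 1960: (1950, 1970), 1922: (1898, 1946)}


def rescale_ne(ne, t_original, t_new):
    """Rescale Ne assuming it is proportional to the number of generations."""
    return ne * t_new / t_original


def rescaled_estimates():
    """Return {date: (Ne at maximum separation, Ne at minimum separation)}."""
    results = {}
    for date, ne in ORIGINAL_NE.items():
        t_original = TOP_LAYER - date  # generations, at one generation per year
        oldest, youngest = REDATED[date]
        results[date] = (
            rescale_ne(ne, t_original, TOP_LAYER - oldest),
            rescale_ne(ne, t_original, TOP_LAYER - youngest),
        )
    return results


if __name__ == "__main__":
    for date, (ne_max, ne_min) in rescaled_estimates().items():
        print(f"2006 - {date}: max separation {ne_max:.0f}, min separation {ne_min:.0f}")
```

The rescaled values agree with the "missing data" rows of Table ESM2 to within rounding of the published point estimates, which is consistent with dating uncertainty changing the estimates only through the number of generations.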