DEVELOPMENT OF A METHODOLOGY FOR THE APPLICATION OF SYNTHETIC DNA IN STREAM TRACER INJECTION EXPERIMENTS (SUPPORTING INFORMATION)

Jan Willem Foppen(a,*), Judith Seopa(a), Noel Bakobie(a), and Thom Bogaard(a,b)

(a) UNESCO-IHE Institute for Water Education, P.O. Box 3015, 2601 DA Delft, The Netherlands; e-mail: j.foppen@unesco-ihe.org
(b) Delft University of Technology, Water Resources Section, Delft, The Netherlands
(*) Corresponding author

Introduction

This text provides background information on the preliminary steps in the protocol, e.g. determining the type of batch container and the optimal DNA sampling technique. Then, a detailed account is given of how we determined DNA concentrations, including inhibition, standard curve efficiency, and individual amplification curve efficiencies. Finally, we describe how we determined confidence intervals of DNA concentrations.

Materials and Methods

Preliminary steps in the development of the protocol

The initial steps in the development of the protocol were to determine the effect of batch containers and water sample-taking techniques on synthetic DNA concentrations, and to select an adequate type of container and a suitable sample-taking technique. In our laboratory, we tested glass bottles (250 ml), polyethylene (PE) bottles (250 ml), and polypropylene co-polymer (PPCO) bottles (250 ml; Nalgene, Thermo Fisher Scientific). To each of the bottles, 100 ml of MilliQ water and 450 μl of a concentrated solution of T22 and T23 were added. The bottles were shaken by hand and samples were taken immediately. Thereafter, the bottles were placed on an Innova 2100 orbital shaker at 100 rpm, and samples were taken every 1-2 hours for a total duration of 10 hours.
The results (not shown) indicated that DNA mass recovery from the PE bottles and the PPCO bottles was not statistically significantly different (see the paragraph on 'Assessing confidence intervals of DNA tracer concentrations' below), while mass recovery from the glass bottles differed significantly from the input mass. From this we concluded that both PE and PPCO bottles could be used as containers for our batch experiments. However, since PPCO bottles are known to have a very low affinity for adsorbing DNA particles (Gaillard and Strauss, 2000), we chose to carry out our batch experiments with PPCO bottles.

Next, in a PPCO bottle with 100 ml of MilliQ water and 450 μl of a concentrated solution of T22 and T23, we tested a number of sample-taking techniques: 1) sample collection in a 2.0 ml polypropylene syringe (BD Plastipak), 2) sample collection in the same type of syringe, but sample disposal through a 0.22 µm cellulose acetate syringe filter (Membrane Solutions, part # SFCA013022N), 3) sample collection in the same type of syringe, but sample disposal through a 1.0 µm glass fiber syringe filter (Membrane Solutions, part # SFGF013100N), and 4) sample collection in a 1 ml low adhesion polypropylene disposable gradient filtertip without surface charge, to ensure no binding of charged molecules like DNA, and equipped with a Self Sealing Non Collapsing (SSNC) filter tip (Bioplastics, Landgraaf, The Netherlands) to avoid cross-contamination. In all four cases, the sample was stored in 2.0 ml Eppendorf vials made of M-type polypropylene with less than 1% DNA adhesion (Bioplastics, Landgraaf, The Netherlands). The results (data not shown) indicated that glass fiber filters retained a statistically significant fraction of the synthetic DNA mass (T22 or T23), while cellulose acetate filters, known to release dissolved organic carbon into the sample (e.g.
Khan and Subramania-Pillaib, 2007), apparently increased the efficiency of the qPCR reaction, giving rise to recovered DNA masses (T22 or T23) well in excess of 100%. The best results were obtained when no filter was used, with either a syringe or a filtertip. We chose method 4, the filtertips, because of their known low DNA binding characteristics.

Determining DNA tracer concentrations

The PCR reaction itself takes place in a carefully defined mixture of PCR buffer, primers, probe, and sample, up to a total volume of 20-50 µl in a so-called 'well', which is then inserted into a qPCR apparatus. In our research, a standard curve was determined for each brook injection experiment, as we expected that the quality of the brook water could affect the performance of the enzymatic qPCR reaction and thereby the determination of DNA concentrations. In addition, prior to each experiment, a water sample was taken from each brook and transported in a coolbox to our laboratory, where it was analysed for cations on an ICP-MS (Thermo Fisher Scientific XSeries 2).

For each standard curve sample, the final mixture per qPCR well consisted of 2 µl 10X PCR buffer (Sigma-Aldrich product # P2317, a buffer consisting of 100 mM Trizma©-HCl, pH 8.3 at 25 °C, and 500 mM KCl), 0.8 µl of PCR grade MgCl2 solution (Sigma-Aldrich product # M8787 with a concentration of 25 mM), 0.125 µl of forward primer and 0.125 µl of reverse primer, both at a final concentration of 62.5 nM, 0.3 µl probe at a final concentration of 150 nM, 1 µl Taq polymerase (Genscript product # E00007), corresponding to a final enzyme concentration of 0.5 U per PCR well, and 0.25 µl dATP (Genscript product # C01577), 0.25 µl dGTP (product # C01578), 0.25 µl dCTP (product # C01579), and 0.25 µl dTTP (product # C01580) at a final concentration of 0.2 mM each.
To this mixture, 4.0 µl of sample with known DNA marker concentration, 4.0 µl of brook water, and 6.65 µl of diethyl pyrocarbonate (DEPC) treated PCR grade water were added, up to a total PCR well volume of 20 µl.

Samples from injection experiments were prepared in a similar fashion. The only difference was that, instead of adding 4.0 µl of sample with known DNA concentration, 4.0 µl of brook water, and 6.65 µl of DEPC treated PCR grade water, we used 4.0 µl of sample from the injection experiment (with unknown DNA concentration) and 10.65 µl of DEPC treated PCR grade water (again adding up to a total PCR well volume of 20 µl).

All samples were subjected to an amplification protocol consisting of an initial denaturation step of 3 min at 95 °C, followed by 40 cycles of denaturation and annealing/extension. The denaturation step was at 95 °C for 15 seconds and the annealing/extension step was at 60 °C for 1 minute.

Standard curves were prepared covering a dynamic range of 6 to 7 10-fold serial dilutions, corresponding to DNA marker concentrations ranging between 10 and 10^7 DNA particles/PCR well. All standard curves were prepared in duplicate. In addition, the standard curve data were used to determine the efficiency $E$ [in %] of the PCR reaction at various concentrations as:

$$E = \left(10^{-1/s} - 1\right) \cdot 100 \tag{S1}$$

whereby $s$ was defined as the slope of the linear curve fitted between the (known) concentrations [expressed as the logarithm of the number of DNA marker particles per µl in a PCR well] and the threshold cycle. Furthermore, the standard curve was used to assess the occurrence of inhibition of the polymerase enzyme due to the presence of inhibiting compounds in the brooks we used for the injection experiments. These inhibiting compounds may be humic substances and/or complex polysaccharides, which are usually present in brook waters.
Generally, when inhibition occurs, owing to the large dynamic range of qPCR, samples can simply be diluted, usually between 10 and 100 times, in order to 'dilute away' inhibitors and to generate high quality amplification profiles with adequate efficiency. To each standard curve, a so-called no template control (NTC) was added. The purpose of the NTC was to check for contamination during the preparation of the mixture for PCR analysis and to assess the importance of random amplification and the formation of primer-dimers. Of course, for completely controlled assays, random amplification should be absent, but since we worked with environmental waters whose exact composition we never knew, some random amplification occurred, and therefore the lowest reliable detection threshold cycle had to be determined.

In addition to determining the qPCR efficiency from the slope of the log-transformed standard curve, we also determined the efficiency of each individual amplification curve, $E_i$, for each sample by using LinRegPCR (Ramakers et al., 2003; Ruijter et al., 2009) according to:

$$N = N_0 \, (E_i)^C \tag{S2}$$

where $N$ is the number of target molecules or amplicons at the end of cycle number $C$, and $N_0$ is the number of amplicons at the beginning of the reaction. In this equation, the efficiency $E_i$ of the reaction is a number between 1 and 2. LinRegPCR determines a baseline fluorescence and performs a baseline subtraction. Then a so-called Window-of-Linearity is set, and PCR efficiencies per sample are calculated from the slope of each individual amplification curve. We used LinRegPCR because we wanted the sample-specific qPCR efficiency as an additional check on the quality of the design of the DNA markers, since a well designed DNA marker should have similar qPCR efficiencies across the entire range of concentrations.
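
As an illustration of the two efficiency definitions above, the following Python sketch computes the standard-curve efficiency of eq. (S1) and a per-sample efficiency in the sense of eq. (S2). It is a minimal sketch under stated assumptions, not the procedure implemented in LinRegPCR: the function names are hypothetical, the slope s is assumed to come from regressing threshold cycle on log10 concentration, and the per-sample fit assumes that the fluorescence values passed in are already baseline-corrected and lie inside the log-linear window. The input data are synthetic.

```python
import math


def _slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def standard_curve_efficiency(log10_conc, threshold_cycles):
    """Eq. (S1): E [%] = (10**(-1/s) - 1) * 100.

    Assumption: s is the slope of threshold cycle versus log10 concentration.
    """
    s = _slope(log10_conc, threshold_cycles)
    return (10 ** (-1.0 / s) - 1.0) * 100.0


def sample_efficiency(cycles, fluorescence):
    """Eq. (S2): N = N0 * E_i**C, so log10(N) is linear in C with slope log10(E_i).

    Assumption: fluorescence is baseline-corrected and restricted to the
    log-linear phase; no window selection is attempted here.
    """
    log_f = [math.log10(f) for f in fluorescence]
    return 10 ** _slope(cycles, log_f)


# Synthetic standard curve: perfect doubling gives 1/log10(2) cycles per decade.
log10_conc = [1, 2, 3, 4, 5, 6, 7]
ct = [35.0 - c / math.log10(2) for c in log10_conc]
efficiency_percent = standard_curve_efficiency(log10_conc, ct)

# Synthetic amplification curve with a per-cycle efficiency of 1.8.
cycles = [18, 19, 20, 21, 22]
fluor = [1e-6 * 1.8 ** c for c in cycles]
e_i = sample_efficiency(cycles, fluor)
```

With these definitions, an efficiency of 100% in eq. (S1) corresponds to a per-cycle efficiency of 2 in eq. (S2).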
All pipetting actions were carried out with Gilson Pipetmans (P10, P20, P100, P200) with the same disposable gradient filtertips as described above. All samples were stored in 1.5 ml Eppendorf vials made of M-type polypropylene with less than 1% DNA adhesion (Bioplastics, Landgraaf, The Netherlands). The qPCR mix (all compounds as described above, except sample) was prepared in one DNA workstation (Kisker L020-GC), while the qPCR wells were prepared in another, similar DNA workstation located adjacent to the first one.

Assessing confidence intervals of DNA tracer concentrations

Carrying out a qPCR analysis requires a number of pipetting actions of volumes ranging from 4 to 500 µl. The preparation of the qPCR mix (see above) requires the pipetting of 8 different compounds, and the preparation of a standard curve requires the dilution of the original, purchased vial of synthetic DNA by pipetting 10 µl into 90 µl of TE buffer (pH = 8.0), and repeating this process serially 13 times in order to arrive at ultra-diluted DNA concentrations of 50 or fewer DNA particles per well. Then, 4 µl of sample plus 16 µl of PCR mix is finally pipetted into a PCR well. So, for the preparation of the most dilute DNA marker concentration, a total of 26 (= 2 x each serial dilution) + 8 (preparation of qPCR mix) + 2 (adding qPCR mix and sample to a qPCR well) = 36 pipetting actions are required. Depending on the accuracy and precision of the pipet and pipetting system used, this gives rise to errors in the resulting DNA concentration, and it is important to quantify these errors in order to determine realistic confidence intervals of the DNA marker concentrations. We identified 2 types of errors: 1. Serial dilution errors, or, perhaps more accurately, the propagation of the error in serial dilutions, and 2.
Random errors due to differences in performance of the qPCR apparatus and/or due to minor variations in the preparation of the qPCR mix or in adding qPCR mix and sample to a qPCR well.

Serial dilution errors

The serial dilution error was determined following Hedges (2002). Briefly, the serial dilution error consists of two parts: 1. pipetting errors and 2. Poisson sampling errors. With regard to the pipetting errors, for a dilution

$$D = \frac{U}{W} \equiv \frac{u^n}{(u+v)^n}$$

whereby $u$ is a volume pipetted into a volume $v$ to make a dilution of $u/(u+v)$, and whereby $n$ is the step number in this serially repeated process, the variances of $U$, $W$, and $D$ are:

$$\mathrm{Var}(U) \equiv \mathrm{Var}(u^n) \cong n \cdot u^{2(n-1)} \left\{ \mathrm{Var}(u) + (n-1) \cdot \mathrm{Covar}(u_i, u_j) \right\} \tag{S3}$$

$$\mathrm{Var}(W) \equiv \mathrm{Var}(w^n) \cong n \cdot w^{2(n-1)} \left\{ \mathrm{Var}(w) + (n-1) \cdot \mathrm{Covar}(w_i, w_j) \right\} \tag{S4}$$

$$\mathrm{Var}(D) \equiv \mathrm{Var}\!\left(\frac{U}{W}\right) \cong D^2 \left\{ \frac{\mathrm{Var}(U)}{U^2} + \frac{\mathrm{Var}(W)}{W^2} - \frac{2\,\mathrm{Covar}(U, W)}{U W} \right\} \tag{S5}$$

whereby $w = u + v$, and indices $i$ and $j$ are consecutive dilutions. $\mathrm{Var}(u)$ and $\mathrm{Var}(w)$ were determined from the (in)accuracy and (im)precision of the pipet, both of which we determined for each Gilson Pipetman used in the preparation of the qPCR samples. Hedges (2002) shows that the pipetting error, $\mathrm{Var}(P)$, can be approximated by:

$$\mathrm{Var}(P) \cong (u')^2 \cdot \mathrm{Var}(D) + D^2 \cdot \mathrm{Var}(u') \tag{S6}$$

whereby $u'$ is the sample volume delivered to the qPCR well. In addition to this serial dilution variance, due to the randomly distributed nature of DNA marker particles in each sampling vial, there is a Poisson sampling error, $S$, determined as:

$$\mathrm{Var}(S) = N + (u')^2 \cdot N^* \cdot \sum_{i=n+1}^{2n} \left\{ \frac{1}{z^i} \right\} \tag{S7}$$

whereby $N$ is simply the observed DNA marker concentration, $N^*$ is the estimate of the unknown original DNA marker concentration, and $1/z$ is the size of the dilution step. The first part on the right hand side of eq. (S7) is the error due to the final sampling to be delivered to the qPCR well, and the second part on the right hand side of eq.
(S7) is the error due to the cumulative sampling during the making of the dilution series. Finally, the confidence interval of a DNA marker concentration at a 95% confidence level due to serial dilution errors can then be determined with:

$$C.I. = 1.96 \cdot \sqrt{\mathrm{Var}(N) \cdot \frac{1}{D^2} \cdot \frac{1}{(u')^2}} \tag{S8}$$

whereby:

$$\mathrm{Var}(N) = \mathrm{Var}(P) + \mathrm{Var}(S) \tag{S9}$$

Random errors

Apart from serial dilution errors, identical DNA marker samples should yield identical threshold cycles, but it is our experience that they do not. In order to quantify the precision of the qPCR apparatus, we determined the threshold cycle of 48 identical samples, and used the standard deviation as a measure of the precision of the instrument. Finally, in order to quantify the magnitude of the random variation in threshold cycle from sample taking to qPCR analysis, we prepared a PPCO bottle with 100 ml of MilliQ water into which we injected 10 µl of concentrated DNA marker solution, and collected duplicate samples every 2 hours for a total duration of 10 hours. For this case, we used the standard deviation of the DNA marker concentrations as a measure of the random errors in the entire process from sample taking to qPCR analysis.

Results

Assessing confidence intervals of DNA marker concentrations

The (in)accuracy and (im)precision of all pipetting equipment used in our laboratory was around 2% on a routine basis. With a final sample volume in a PCR well of 4 µl, the confidence interval from the 1st 10-fold dilution to the 12th 10-fold dilution increased from 5 to almost 40% of the initial undiluted concentration (Fig. S1). Since the standard curve used 10-fold dilutions from step 6 to step 12, meaning that we used the initial concentration serially diluted 10^6 to 10^12 times, the confidence interval associated with the standard curve ranged from 19% to almost 40% of the initial undiluted concentration.
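
The pipetting part of the serial dilution error, eqs. (S3)-(S6), can be sketched in Python as follows. This is a simplified illustration and not the calculation behind Fig. S1: it assumes that all covariance terms are zero, assigns the same relative standard deviation (2%) to every pipetted volume, and omits the Poisson sampling term of eq. (S7), which requires the undiluted concentration N*. The function name and its defaults (10 µl pipetted into 90 µl, 4 µl delivered to the well) are illustrative choices based on the volumes mentioned in the text, so the numbers it produces need not match the reported confidence intervals.

```python
import math


def pipetting_ci_ratio(n, u=10.0, v=90.0, u_final=4.0, rel_sd=0.02):
    """95% confidence half-width relative to the undiluted concentration,
    from pipetting errors only, after n serial dilution steps.

    Assumptions (not from the source): zero covariances, equal relative
    standard deviation for u, w = u + v and u_final, no Poisson term.
    """
    w = u + v
    var_u = (rel_sd * u) ** 2
    var_w = (rel_sd * w) ** 2
    var_u_final = (rel_sd * u_final) ** 2

    big_u = u ** n
    big_w = w ** n
    d = big_u / big_w

    var_big_u = n * u ** (2 * (n - 1)) * var_u  # eq. (S3), Covar = 0
    var_big_w = n * w ** (2 * (n - 1)) * var_w  # eq. (S4), Covar = 0
    var_d = d ** 2 * (var_big_u / big_u ** 2 + var_big_w / big_w ** 2)  # eq. (S5)
    var_p = u_final ** 2 * var_d + d ** 2 * var_u_final  # eq. (S6)

    # Eq. (S8) with Var(N) replaced by Var(P) only.
    return 1.96 * math.sqrt(var_p) / (d * u_final)


ratios = {n: pipetting_ci_ratio(n) for n in (1, 6, 12)}
```

Under these assumptions the relative variance grows linearly with the number of dilution steps, which is why the ratio increases with n.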
Furthermore, the standard deviation of the threshold cycle measured by the qPCR apparatus for 48 identical samples was 0.9% (n=48). Since the amount of DNA is ideally doubled every cycle, a standard deviation of ~1% equalled ~0.1-0.3 cycle, assuming an amplification protocol of 10-30 cycles. This was equivalent to a DNA mass standard deviation of 2^0.1 to 2^0.3, or 7-23% of the mean measured concentration. The standard deviation of 24 samples taken from an identical batch experiment with MilliQ water, carried out in order to quantify the random error in the entire process from sample taking to qPCR analysis, was 14% of the input concentration.

During the tracer injection experiments, we used a 100 times (10 x 10-fold) serially diluted sample taken from the PPCO container with concentrated T23 mixed with 100 ml MilliQ water that we injected in the brook (see Methods section). The threshold cycle of these samples was between 10 and 14, while threshold cycles of DNA marker samples from the brook at measurement points 1 and 2 were between 20 and 26. A cycle difference of 12 corresponded to a 4-step 10-fold serial dilution of standard curve samples. So, going from the DNA input mass at the point of injection to the measured DNA concentration in the brook at the points of measurement involved 6 steps in a 10-fold serial dilution series (2 steps for the 100 times diluted input sample plus 4 steps for the cycle difference), which corresponded to a confidence interval of 19.2% of the input concentration. Together with a random error of 14% of the input concentration, this gave, in our experiments, a confidence interval of ~33% of the input concentration at a 95% confidence level for a T23 concentration determined in a sample taken from the field, and such confidence intervals also applied to the batch experiments.

Amplification efficiencies and standard curve analysis

The Maisbich had a relatively low EC-value (125-135 µS/cm) and low concentrations of most of the cations (Table S1). The Heuwelerbach had the highest EC-value (412 µS/cm), with a relatively high calcium concentration.
The Gloop, Bloop, and Ditch had EC-values between 269 and 382 µS/cm, with similar cation concentrations and, compared to the Luxembourg sites, more Fe and K. Also, the Al concentration in the Ditch was rather high. Furthermore, humic compounds were present in the Gloop and Bloop, as evidenced by their beer-like appearance. In these latter two streams, none of the DNA marker material in the positive standard curve samples was amplified, indicating that inhibition of the PCR reaction occurred. For the Gloop, diluting the samples 4-fold was sufficient to eliminate inhibition, while a 10-fold dilution of the Bloop samples was sufficient to yield good quality amplification curves. For the other brooks, no inhibition occurred. We think the inhibition was due to the presence of the humic compounds mentioned above.

The standard curves of T22 and T23 (Fig. S2), including the diluted Gloop and Bloop standard curves, were similar for all brooks. We carried out an analysis of covariance (ANCOVA) of the log-transformed concentrations versus the threshold cycle, and found that none of the standard curves were statistically different from each other at a significance level of 5%. The standard curve efficiency of marker T22 was between 108.4 and 119.4% with an average of 113.7%, while for T23 the standard curve efficiency was between 89.4 and 103.8% with an average of 93.0%. Efficiencies in excess of 100% are somewhat confusing, and actually indicate that the standard curve is too steep and that amplification efficiencies vary for the different standard curve samples. In fact, the individual amplification efficiencies of T22, determined with LinRegPCR (Fig. S3), ranged between 1.61 and 2.48, while the individual efficiencies of T23 ranged between 1.72 and 1.77 and were almost constant over the entire concentration range of the PCR assay.
For marker T22, we also noted random amplification of no template controls at around 30 cycles of amplification, while for T23 amplification of no template controls did not occur before a threshold cycle of 35 or more. Based on these two assay performance criteria, we concluded that T23 was apparently a better designed DNA marker than T22, although both markers were designed according to the same design considerations. In order to distinguish positive samples from negative ones, we applied a difference of 3-5 cycles between the lowest detectable positive sample and random amplification of a negative control sample. Therefore, for marker T22, injection experiment samples were considered to be positive when the threshold cycle was below 25.3-26.9, depending on the brook. For marker T23, injection experiment samples were considered to be positive when the threshold cycle was below 31.8-32.8, depending on the brook. The T23 concentration belonging to this threshold cycle was 200 particles in a 20 µl PCR well (Fig. S2), so the lowest possible T23 concentration we could measure in a 4 µl sample was (200/4)·10^6 = 5·10^7 particles/L. For Gloop and Bloop, this minimum measurable concentration had to be multiplied by a factor of 4 and 10, respectively, to account for the dilution that was required to eliminate inhibiting effects.

References

Gaillard, C., and F. Strauss, 2000. Eliminating DNA loss and denaturation during storage in plastic microtubes. American Biotechnology Laboratory, Vol. 18, Issue 13, p. 24.

Hedges, A.J., 2002. Estimating the precision of serial dilutions and viable bacterial counts. International Journal of Food Microbiology 76 (2002) 207–214.

Khan, E., and S. Subramania-Pillaib, 2007. Interferences contributed by leaching from filters on measurements of collective organic constituents. Water Research, p. 1841-1850, doi: 10.1016/j.watres.2006.12.028.

Ramakers, C., Ruijter, J.M., Lekanne Deprez, R.H., and A.F.M. Moorman, 2003.
Assumption-free analysis of quantitative real-time polymerase chain reaction (PCR) data. Neuroscience Letters 339 (2003) 62–66, doi:10.1016/S0304-3940(02)01423-4.

Ruijter, J.M., Ramakers, C., Hoogaars, W.M.H., Karlen, Y., Bakker, O., Van den Hoff, M.J.B., and A.F.M. Moorman, 2009. Amplification efficiency: linking baseline and bias in the analysis of quantitative PCR data. Nucleic Acids Research, 2009, Vol. 37, No. 6, doi:10.1093/nar/gkp045.

Tables

Table S1: Cation concentrations (in mg/L) of the brook waters used in this research

Stream               EC (µS/cm)  Na    Ca    Mg   Al     K     Fe     Zn   Humics
Maisbich downstream  125         11.5  8.5   4.4  < 0.2  4.0   < 0.2  0.2  -
Maisbich upstream    135         15.7  16.1  5.1  0.3    9.6   0.3    0.4  -
Heuwelerbach         412         2.9   78.7  1.9  0.7    1.1   0.3    0.3  -
Ditch                382         13.2  31.2  7.5  2.8    18.8  1.9    0.2  -
Gelsloopken          345         14.3  31.0  5.2  0.6    17.5  1.3    0.3  yes
Biezenloop           269         12.6  20.3  5.2  0.7    12.1  0.6    0.2  yes

Figures

[Figure S1 plot not reproduced. x-axis: n-th 10-fold serial dilution (-); y-axis: C.I./N* (-).]

Figure S1: Ratio of confidence interval (C.I.) at 95% confidence level and undiluted concentration (N*) as a function of 10-fold serial dilutions for a pipetting (in)accuracy and (im)precision of 2%, and a final sample volume of 4 µl per qPCR well.

[Figure S2 plots not reproduced. x-axis: threshold cycle; y-axis: Log (part/µl); series: milliQ, Heuwelerbach, Maisbich upstream, Ditch, Gloop 4x, Bloop 10x.]

Figure S2: Standard curves of tracer T22 (above) and T23 (below) for the various surface waters used in this research. Threshold cycles were determined with the software of the qPCR device, which is identical to the threshold cycle algorithm developed by Zhao and Fernald (2005).
Data on the y-axis are expressed as the logarithm of the number of DNA marker particles per µl in a PCR well, so a value of 1 at a threshold cycle of 32.5 in the case of T23 corresponds to 10 DNA particles per µl or 200 particles in a 20 µl PCR well, and to (200/4)·10^6 = 5·10^7 DNA particles per L of sample, since 4 µl of sample was used per PCR well.

[Figure S3 plot not reproduced. x-axis: threshold cycle; y-axis: $E_i$ (1.0 to 2.5); series: T22 and T23.]

Figure S3: Individual amplification efficiencies, $E_i$, determined with LinRegPCR as a function of threshold cycle.
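
The unit conversion behind the T23 detection limit (a standard curve value in log10 particles per µl of well volume, converted to particles per litre of sample) can be written out as a short Python sketch. The function names are hypothetical; the well volume (20 µl), sample volume (4 µl), and the dilution factors for Gloop (4) and Bloop (10) are the values stated in the text.

```python
def particles_per_well(log10_particles_per_ul, well_volume_ul=20.0):
    """Number of DNA particles in one qPCR well from a standard curve value."""
    return 10 ** log10_particles_per_ul * well_volume_ul


def particles_per_litre(n_well, sample_volume_ul=4.0, dilution_factor=1.0):
    """Particles per litre of sample, given the particles found in one well.

    dilution_factor accounts for samples diluted before analysis to remove
    inhibition (4 for Gloop and 10 for Bloop in the text).
    """
    per_ul_of_sample = n_well / sample_volume_ul
    return per_ul_of_sample * 1e6 * dilution_factor  # 1 L = 10^6 µl


n_well = particles_per_well(1.0)                           # 200 particles per well
limit_default = particles_per_litre(n_well)                # 5e7 particles/L
limit_gloop = particles_per_litre(n_well, dilution_factor=4.0)
limit_bloop = particles_per_litre(n_well, dilution_factor=10.0)
```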