Statistics of ddPCR_v1DEC3

Measurements and Statistics of ddPCR Experiments
Version 1.0, released Dec 3, 2012
Presentation Overview
This presentation answers the following questions:
How does QuantaSoft calculate target concentration?
Why is the amount of sample loaded critical to accurate ddPCR
measurements?
What do the two types of error bars presented in QuantaSoft
mean, and which ones do I use?
Figure: technical replicates on the QX100 at a range of concentrations
Key Points About Droplet Digital PCR
Each droplet is an isolated reaction vessel
All PCR reagents are contained in each droplet volume
In any given droplet, target may or may not be present
Readout:
Number of droplets with target (positives)
Number of droplets without target (negatives)
This is an endpoint assay
Only +/- matters.
Consistent PCR efficiency is less important than it is in qPCR
Workflow: generate droplets, run PCR, then count the # of positives and # of negatives with the droplet reader
Copies Per Droplet
CPD = Copies of target Per Droplet
Units are number (#) per unit volume, not mass per unit
volume
CPD is the average number of copies per
droplet. For a CPD of 2, some droplets will
have 0 copies, some will have 1, 2, 3, 4, etc.
Multiple ways to calculate CPD:
Example 1:
$\mathrm{CPD} = \dfrac{\text{total number of molecules}}{\text{total number of droplets}}$
100,000 molecules total, 20,000 droplets:
CPD = (100,000 molecules) / (20,000 droplets)
CPD = 5 molecules / droplet
Example 2:
$\mathrm{CPD} = \dfrac{\text{molecules}}{\mu\text{l}} \times \text{droplet volume } (\mu\text{l})$
20 molecules in a 20 µl sample, 1 nl droplet volume:
20 molecules / 20 µl = 1.0 molecule/µl
CPD = (1.0 molecule/µl)(0.001 µl/droplet)
CPD = 0.001
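The two routes to CPD can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the inputs are the numbers from Examples 1 and 2.

```python
def cpd_from_totals(total_molecules: float, total_droplets: float) -> float:
    """CPD as total molecules divided by total droplets."""
    return total_molecules / total_droplets


def cpd_from_concentration(molecules_per_ul: float, droplet_volume_ul: float) -> float:
    """CPD as concentration (molecules/µl) times droplet volume (µl)."""
    return molecules_per_ul * droplet_volume_ul


# Example 1: 100,000 molecules in 20,000 droplets
example_1 = cpd_from_totals(100_000, 20_000)

# Example 2: 20 molecules in a 20 µl sample, 1 nl (0.001 µl) droplets
example_2 = cpd_from_concentration(20 / 20, 0.001)

print(f"Example 1: {example_1} CPD")
print(f"Example 2: {example_2} CPD")
```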
Low CPD Quantitation: Each Droplet Contains ≤ 1 Target
Before PCR, each droplet
contains at most 1 target.
This is traditional “limiting
dilution.”
6 copies of target in a 20 μl sample
(0.3 copies/μl)
Likely outcome of ddPCR:
6 positives
Simple formula for low-concentration case only (1 nl droplet volume):
$\dfrac{N_{pos}}{N_{total}} \times 1000 = \text{copies}/\mu\text{l}$
$N_{pos}$ = # of droplets with template
$N_{total}$ = Total # of droplets
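As a rough illustration, the low-concentration formula can be written as a small Python function. The function name and the `droplet_volume_nl` parameter are hypothetical; the input is the 6-positive example above, assuming 20,000 droplets in total.

```python
def low_cpd_copies_per_ul(n_pos: int, n_total: int, droplet_volume_nl: float = 1.0) -> float:
    """Approximate copies/µl, valid only when positive droplets hold a single target."""
    droplets_per_ul = 1000.0 / droplet_volume_nl  # 1 µl = 1000 nl
    return n_pos / n_total * droplets_per_ul


# 6 positive droplets out of an assumed 20,000 droplets of 1 nl each
estimate = low_cpd_copies_per_ul(n_pos=6, n_total=20_000)
print(f"{estimate:.2f} copies/µl")
```

With these inputs the estimate agrees with the 0.3 copies/μl stated for the sample.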
Intermediate CPD: Some Droplets with > 1 Target
Example: 5,000 targets in 20,000 droplets
Note: 5,000 targets in 20 μl = 250 copies per μl = 0.25 CPD
Observations:
We might expect 25% of the droplets to contain 1 copy of target and 75% of the
droplets to contain no copies of target, but in reality, we’ll see some droplets with 2, 3,
or even 4 copies, and correspondingly more droplets with 0 copies
Statistics tells us exactly what to expect (on average):
0 targets: 78% of droplets
1 target: 19.5% of droplets
2 targets: 2.4% of droplets
3 targets: 0.2% of droplets
4 targets: 0.01% of droplets
“negative”: 78% (not 75%)
“positive”: 22% (not 25%)
Droplets that start out with 1,2,3, or
4 targets all look the same after
thermal cycling – PCR saturates
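The percentages above follow from the Poisson distribution that the presentation introduces on its formulas slide. A minimal sketch, with function and variable names chosen only for illustration, reproduces them for 0.25 CPD:

```python
import math


def poisson_pmf(n: int, cpd: float) -> float:
    """Probability that a droplet holds exactly n targets when the mean is cpd."""
    return cpd ** n * math.exp(-cpd) / math.factorial(n)


cpd = 0.25
occupancy = {n: poisson_pmf(n, cpd) for n in range(5)}
negative_fraction = occupancy[0]            # droplets with no target
positive_fraction = 1.0 - negative_fraction  # droplets with 1 or more targets

for n, p in occupancy.items():
    print(f"{n} targets: {100 * p:.2f}% of droplets")
print(f"negative: {100 * negative_fraction:.0f}%, positive: {100 * positive_fraction:.0f}%")
```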
High CPD: More Target Molecules Than Droplets
Example: 50,000 targets in 20 μl
= 2,500 targets per μl
= 2.5 CPD
The average number of molecules per droplet will be 2.5.
But there will still be many droplets with 0 molecules.
Droplet occupancy, 2.5 CPD
# of target molecules | # of droplets | Percent of total droplets
0 | 1642 | 8.21%
1 | 4104 | 20.5%
2 | 5130 | 25.7%
3 | 4275 | 21.4%
4 | 2672 | 13.4%
5 | 1336 | 6.68%
6 | 557 | 2.78%
7 | 199 | 0.99%
8 | 62 | 0.31%
9 | 17 | 0.086%
10 | 4 | 0.022%
11 | 1 | 0.0049%
Figure: histogram of droplet occupancy at 2.5 CPD (x-axis: # target molecules; y-axis: count of droplets)
When we package 50,000
molecules into 20,000 droplets,
on average 1642 droplets will
have 0 targets, 4104 droplets
will have 1 target, 5130 droplets
will have 2 targets, etc.
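One way to see this packaging effect is to simulate it. The sketch below assumes each molecule lands in a uniformly random droplet, which is a modelling assumption and not something the slides spell out. The function name and seed are arbitrary.

```python
import random
from collections import Counter


def simulate_occupancy(n_molecules: int, n_droplets: int, seed: int = 0) -> Counter:
    """Drop each molecule into a random droplet, then tally droplets by copy number."""
    rng = random.Random(seed)
    copies = [0] * n_droplets
    for _ in range(n_molecules):
        copies[rng.randrange(n_droplets)] += 1
    return Counter(copies)


# 50,000 molecules packaged into 20,000 droplets (2.5 CPD)
histogram = simulate_occupancy(50_000, 20_000)
for n_targets in sorted(histogram):
    print(f"{n_targets:>2} targets: {histogram[n_targets]} droplets")
```

A single run should scatter around the table's averages without matching them exactly.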
Average Number of Empty Droplets Changes with CPD
CPD | Empty droplets expected (“negatives”) | Occupied droplets expected (“positives”)
0.25 | 15,576 | 4,425
1 | 7,358 | 12,642
2.5 (data from table on previous slide) | 1,642 | 18,357
Figure: one histogram per CPD (x-axis: # of target molecules in droplet; y-axis: # of droplets)
We calculate CPD based on the number of empty droplets observed.
If CPD is Too High, There Are Not Enough Negative
Droplets for Quantitation
CPD | Empty droplets expected (“negatives”)
8 | 7
10 | 1
15 | 0
Figure: one histogram per CPD (x-axis: # target molecules; y-axis: # of droplets)
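The expected number of empty droplets at any CPD can be sketched in Python. The function name is illustrative, and the default of 20,000 droplets comes from the note that a 20 μl sample is partitioned into 20,000 droplets.

```python
import math


def expected_empty_droplets(cpd: float, n_droplets: int = 20_000) -> float:
    """Average number of droplets with zero targets: n_droplets * e^(-cpd)."""
    return n_droplets * math.exp(-cpd)


for cpd in (0.25, 1, 2.5, 8, 10, 15):
    print(f"{cpd:>5} CPD: about {expected_empty_droplets(cpd):.0f} empty droplets")
```

Above roughly 8 CPD only a handful of negatives remain, so the estimate rests on very few droplets.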
Formulas Used in ddPCR
c = CPD
E = observed fraction of droplets that are empty
$V_{droplet}$ = volume of droplet
$\Pr(n) = \dfrac{c^{n} e^{-c}}{n!}$
Poisson distribution: probability that a droplet will
contain n copies of target if the mean # of target
copies per droplet is c.
$\Pr(0) = e^{-c}$
Poisson distribution, n = 0. Probability that a
droplet will be empty for a given value of c.
$c = -\ln(E)$
Best estimate of CPD given the fraction of
observed droplets that are empty.
$\mathrm{conc} = \dfrac{c}{V_{droplet}}$
Concentration of sample
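Chained together, these formulas turn droplet counts into a concentration. The sketch below is not QuantaSoft's code: the function name, argument names and the 0.001 μl (1 nl) default droplet volume are assumptions for illustration.

```python
import math


def concentration_from_counts(n_negative: int, n_total: int,
                              droplet_volume_ul: float = 0.001):
    """Return (CPD, copies per µl) from the count of empty droplets."""
    if not 0 < n_negative <= n_total:
        raise ValueError("need at least one empty droplet and n_negative <= n_total")
    empty_fraction = n_negative / n_total  # E
    cpd = -math.log(empty_fraction)        # c = -ln(E)
    return cpd, cpd / droplet_volume_ul    # conc = c / V_droplet


# 15,576 empty droplets out of 20,000 (the 0.25 CPD case)
cpd, copies_per_ul = concentration_from_counts(15_576, 20_000)
print(f"CPD = {cpd:.3f}, concentration = {copies_per_ul:.0f} copies/µl")
```

The check on `n_negative` reflects the point made earlier: with no empty droplets, E is 0 and the logarithm is undefined.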
ddPCR Confidence Intervals
Two types of errors are reported by QuantaSoft
Poisson errors: calculated for a single well or merged well,
with contributions from subsampling and partitioning
Total errors: calculated for replicates
Experiments Involve Subsampling
In most molecular biology experiments, we analyze part of a
whole (a subsample)
Examples:
a sample of blood
a biopsy from a tumor
an aliquot from a tube of DNA
Whenever you subsample from a larger volume, there is a
subsampling error
Subsampling error is most significant at low concentrations
Subsampling At Low Concentrations
No subsampling – analyze entire volume of sample
Perfect counting machine: count molecules in entire volume: no
subsampling error.
Subsampling error – analyze part of a sample
150 molecules in sample.
Subsample 1/25
(6 molecules expected)
Perfect counting machine: expect 6 molecules, measure 5 molecules.
Most of the 25 subsamples contain 4,
5, 6, 7, or 8 molecules – this
uncertainty is what we mean by
subsampling error. This uncertainty
contributes to the “Poisson error
bars.”
Subsampling Error is Inevitable
M = expected number of target
molecules in ddPCR reaction
Fundamental subsampling limits:
$\mathrm{stdev} = \sqrt{M}$
$\mathrm{CV} = \dfrac{\sqrt{M}}{M} = \dfrac{1}{\sqrt{M}}$
Subsampling example: Suppose a person
has a total of 100,000 copies of a particular
target in his blood (5 liters total volume) and
you take 5 ml of plasma, extract DNA, and
run ddPCR. On average, you will find 100
copies of target, but the standard deviation of
this measurement is 10 and the CV is 10%.
Figure: CV % versus expected # of items (x-axis: 0 to 5,000)
When subsampling from a large volume, these are absolute limits on measurement error.
You cannot do better when measuring error properly.
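The subsampling limit is easy to evaluate numerically. The following sketch uses a hypothetical helper name and takes its inputs from the 100,000-copy example above.

```python
import math


def subsampling_limits(expected_molecules: float):
    """Return (stdev, CV in percent) for a subsample expected to hold M molecules."""
    stdev = math.sqrt(expected_molecules)
    cv_percent = 100.0 * stdev / expected_molecules
    return stdev, cv_percent


total_copies = 100_000
fraction_sampled = 5 / 5000  # 5 ml drawn from 5 liters
expected = total_copies * fraction_sampled
stdev, cv_percent = subsampling_limits(expected)
print(f"expected = {expected:.0f}, stdev = {stdev:.0f}, CV = {cv_percent:.0f}%")
```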
Error Bars at High Concentration: Effect of
Partitioning into Droplets
Start with a sample with exactly 288 target molecules. Partition into 144 droplets (2 CPD)
Figure: grid of droplets; each small square represents a 1 nL droplet, shown as an empty droplet or an occupied droplet
Empty droplets: 22 (19 expected)
Calculated concentration: 1.88 CPD
Repeat 3 times. Partitioning of the 288 molecules will be a little different every time
Empty: 20, Est Conc: 1.97 CPD
Empty: 17, Est Conc: 2.14 CPD
Empty: 19, Est Conc: 2.03 CPD
This uncertainty contributes to the “Poisson error bars.”
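The estimated concentrations on this slide come from applying c = −ln(E) to each count of empty droplets. A short check, with variable names chosen for illustration:

```python
import math

N_DROPLETS = 144
observed_empty = [22, 20, 17, 19]  # empty-droplet counts from the four partitionings

# Estimated CPD for each partitioning: c = -ln(empty / total)
estimates = {n: -math.log(n / N_DROPLETS) for n in observed_empty}
for n_empty, cpd in estimates.items():
    print(f"Empty: {n_empty:>2}  Est Conc: {cpd:.2f} CPD")
```

The true value is 2 CPD in every case; the spread of the estimates is the partitioning error.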
Error Bars at High Concentration
Figure: Relative contribution of partitioning error and subsampling error to ddPCR error (x-axis: CPD, 0 to 6; y-axis: CV(%); curves: ddPCR error (15,000 droplets), subsampling error, partitioning error)
$\mathrm{CV} = \dfrac{\mathrm{stdev}}{\mathrm{mean}} \cdot 100$
CV is standard deviation expressed as a percentage.
At high CPD, uncertainty due to partitioning is higher than uncertainty due to subsampling.
Dotted lines (at 0.11 and 5.73 CPD) show CPD range with CV < 2.5%. The lowest CV occurs at a CPD of ~1.6.
Example: 3 CPD
•CV = 1.19 % (assuming 15,000 droplets read)
•95% CI = 2.93-3.07
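The slides do not give a formula for the ddPCR error curve. One common approximation (an assumption here, not confirmed as QuantaSoft's method) treats the empty fraction E as a binomial proportion and propagates its variance through c = −ln(E), giving stdev(c) ≈ sqrt((e^c − 1)/N). The sketch below uses that approximation with illustrative function names, and it appears to reproduce the 3 CPD example.

```python
import math


def ddpcr_cv_percent(cpd: float, n_droplets: int) -> float:
    """Approximate CV (%) of the CPD estimate (delta-method assumption)."""
    return 100.0 * math.sqrt((math.exp(cpd) - 1.0) / n_droplets) / cpd


def subsampling_cv_percent(cpd: float, n_droplets: int) -> float:
    """CV (%) from subsampling alone: 100 / sqrt(M), with M = cpd * n_droplets."""
    return 100.0 / math.sqrt(cpd * n_droplets)


def ci95(cpd: float, n_droplets: int):
    """Approximate 95% CI on CPD as mean +/- 1.96 standard deviations."""
    half_width = 1.96 * cpd * ddpcr_cv_percent(cpd, n_droplets) / 100.0
    return cpd - half_width, cpd + half_width


low, high = ci95(3.0, 15_000)
print(f"CV at 3 CPD: {ddpcr_cv_percent(3.0, 15_000):.2f}%  95% CI: {low:.2f}-{high:.2f}")

# Scan 0.01 to 6.00 CPD for the lowest CV
best_cpd = min((c / 100 for c in range(1, 601)),
               key=lambda c: ddpcr_cv_percent(c, 15_000))
print(f"Lowest CV near {best_cpd:.2f} CPD")
```

Comparing `ddpcr_cv_percent` with `subsampling_cv_percent` at the same CPD shows how much of the total is left over for partitioning.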
Error Bars at Low Concentration
Figure: Errors at low concentration (x-axis: target molecules per 20 μl, 0 to 15,000; y-axis: CV(%); curves: subsampling error, ddPCR error (15,000 droplets))
At low concentration, the largest contribution to the error is from subsampling.
Partitioning a given sample into more droplets will not change it.
Range is 0.0067 to 1 CPD, or 100 to 15,000 copies of target in the sample (15,000 droplets)
“Technical Replicates” Describes Multiple Different
Experimental Designs
Figure: replicate designs along the workflow (Sample, Post-treatment DNA, DNA for ddPCR, DNA + MMX, ddPCR well)
Poisson confidence intervals (CIs) reported by QuantaSoft capture the 1st example.
CIs can be calculated without replicates
The 2nd and 3rd examples show why total CIs might be larger than Poisson CIs –
additional variability from the entire process is often larger than measurement error.
Poisson Error Bars Estimate Errors on Pure
Technical Replicates
qPCR
Estimate of the mean and CI based on technical replicates (y-axis: relative concentration).
This type of error is captured by Poisson CI in ddPCR.
ddPCR, one well
Calculate mean and CI based on known statistical properties of digital observations (y-axis: copies per microliter).
ddPCR, droplets from replicates pooled in metawell
Treat all three wells together as one big well, and calculate CI as for 1 well (Poisson CI; y-axis: copies per microliter).
ddPCR, mean and CI from replicates
Estimate mean and CI based on replicates, to account for additional types of variability (y-axis: copies per microliter).
Inner error bars show CIs based on fundamental ddPCR statistics (Poisson CI)
Outer error bars (total CI) include all the observed variability that is not accounted for by fundamental ddPCR statistics.
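Pooling replicate wells into a metawell amounts to adding up droplet counts before applying c = −ln(E). The sketch below illustrates the idea only: the well counts are hypothetical, and the function names are not QuantaSoft's.

```python
import math


def well_cpd(n_negative: int, n_total: int) -> float:
    """CPD estimate for one well (or metawell): c = -ln(E)."""
    return -math.log(n_negative / n_total)


def merge_wells(wells):
    """Sum (n_negative, n_total) pairs so replicates are treated as one big well."""
    return sum(neg for neg, _ in wells), sum(total for _, total in wells)


# Hypothetical (negative, total) droplet counts for three replicate wells
wells = [(12_100, 15_000), (11_950, 14_800), (12_300, 15_200)]

per_well = [well_cpd(neg, total) for neg, total in wells]
merged = merge_wells(wells)
pooled_cpd = well_cpd(*merged)

print("per-well CPD:", [round(c, 4) for c in per_well])
print("metawell CPD:", round(pooled_cpd, 4))
```

The metawell has about three times as many droplets as a single well, which is what narrows its Poisson CI.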
Biological Replicates – qPCR and ddPCR
qPCR: multiple wells required to estimate measurement error
ddPCR: one well is sufficient to estimate the measurement error
* Multiple wells per sample recommended for extremely low concentrations
Error Bar “Rules of Thumb”
Total error bars are always greater than or equal to Poisson error bars
This is enforced by QuantaSoft; it is not a fundamental property of the math
Total error bars will be approximately equal to Poisson error bars
for true instrument technical replicates with a good assay and good
technique.
If in doubt, report total error bars as 95% confidence interval (CI)
For downstream analysis, stdev = (CImax- CImin)/(2*1.96)
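The conversion from a 95% CI back to a standard deviation can be sketched as follows. The function name is illustrative, and the inputs are the 2.93-3.07 CI quoted earlier for 3 CPD.

```python
def stdev_from_ci95(ci_min: float, ci_max: float) -> float:
    """Standard deviation implied by a symmetric 95% CI (width = 2 * 1.96 * stdev)."""
    return (ci_max - ci_min) / (2 * 1.96)


stdev = stdev_from_ci95(2.93, 3.07)
print(f"stdev = {stdev:.4f} CPD")
```

This assumes the CI is symmetric about the mean, which is implicit in the formula above.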
Summary
In ddPCR, we determine concentration by effectively counting
target molecules
We estimate measurement error in two ways
Based on fundamental statistics
Based on technical replicates combined with fundamental statistics
At low concentrations, subsampling error is a fundamental
limitation of any measurement technique
Units
Term | Definition | Typical range | Significance
Copies/μl | Number of target molecules per μl. | 1-6000 | QuantaSoft reports concentration in these units.
Copies/droplet (CPD) | Number of target molecules per droplet | 0.001-6 | To ensure some empty droplets, load less than 6 CPD.
Genome equivalents (GE) | Approximate number of human genomes present. 1 diploid cell contains 2 GEs of DNA. 1 GE = 3.3 pg. | Depends | For targets present at 1 copy per genome, load <= 6 GE per droplet.
Notes:
•A 20 μl sample is partitioned into 20,000 droplets
•For a 1 nl droplet, 1 CPD = 1 copy/nl = 1000 copies/μl (QuantaSoft units)