Supplementary materials
Here we provide details of the derivation of the two main theoretical results of the paper.
The first theoretical result describes the difference curve obtained by subtracting the melting curves
of samples of different genotypes, each mixed with homozygous reference DNA making up fraction x of
the total DNA.
1) This curve is given by the difference in heteroduplex fractions of the mixtures times the
difference between a weighted average of the homoduplex curves and the mean of the heteroduplex curves.
2) When the melting curves of wild type and mutant homozygotes are indistinguishable, the
weighted average of the homoduplex curves is just this common homozygote curve. In this case
the reference DNA fraction which maximizes separation of the difference curves of mixtures with
all three genotypes of bi-allelic diploid DNA is x = 1/7.

Samples of different genotypes are designated W for wild type, M for homozygous mutant, and H
for heterozygous mutant; forward and reverse strands of wild type amplicon duplexes are
designated w and w’ respectively, and forward and reverse strands of homozygous mutant
amplicon duplexes are designated m and m’ respectively. We will assume that the genotype of the
reference DNA is wild type, but clearly the roles of wild type and homozygous mutant are arbitrary
here and may be reversed.
We assume amplification preserves the relative proportion of all strand species. Thus, Table 1a
models both the initial and post-extension relative concentrations of wild type and
homozygous mutant duplexes when samples of the three genotypes are mixed with homozygous
reference DNA making up fraction x of the total DNA, so that the sample fraction is 1-x. The result
is immediate when the sample is homozygous, of either the same genotype as the reference or the
opposite one; when the sample is heterozygous, half of it contributes to the wild type duplex
fraction and the other half contributes to the mutant homozygous duplex fraction.
Table 1a. Homoduplex fractions at extension phase of PCR
Genotype    [ww’]                          [mm’]
W           1                              0
M           x                              1-x
H           x + (1/2)(1-x) = (1/2)(1+x)    (1/2)(1-x)

Next we assume that a final denaturation and reannealing step prior to melting promotes the
formation of heteroduplexes, with strands pairing independently of whether they are perfectly
complementary ([ww’], [mm’]) or just approximately complementary ([wm’], [mw’]). Mathematically,
this says that the resulting fraction of each of these duplexes is the product of the individual
strand concentrations, which are given by the duplex concentrations in Table 1a. Each row of
Table 1b is then obtained by multiplying out the square of the sum of the two entries in the
corresponding row of Table 1a in binomial (“FOIL”) form, much like a Punnett square.
Since the sums of rows of Table 1a equal 1, so do the sums of rows of Table 1b: when a+b=1,
(a+b)(a+b)=aa+ab+ba+bb=1.
Table 1b. Duplex fractions after denaturation and reannealing of PCR product
Genotype    [ww’]           [wm’]           [mw’]           [mm’]
W           1               0               0               0
M           x^2             x(1-x)          x(1-x)          (1-x)^2
H           (1/4)(1+x)^2    (1/4)(1-x^2)    (1/4)(1-x^2)    (1/4)(1-x)^2
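The step from Table 1a to Table 1b can be sketched in a few lines of Python. This is only an illustration under the random-reannealing assumption stated above; the function names are hypothetical, and exact fractions are used so the rows can be compared with the tables directly.

```python
from fractions import Fraction

def homoduplex_fractions(genotype, x):
    """Table 1a: fractions of wild type and mutant duplexes before reannealing."""
    x = Fraction(x)
    if genotype == "W":
        return Fraction(1), Fraction(0)
    if genotype == "M":
        return x, 1 - x
    if genotype == "H":
        return (1 + x) / 2, (1 - x) / 2
    raise ValueError(f"unknown genotype: {genotype!r}")

def duplex_fractions(genotype, x):
    """Table 1b: each duplex fraction is the product of its two strand fractions."""
    w, m = homoduplex_fractions(genotype, x)
    return {"ww'": w * w, "wm'": w * m, "mw'": m * w, "mm'": m * m}

# Print the three rows of Table 1b for a reference fraction of 1/7.
for genotype in "WMH":
    row = duplex_fractions(genotype, Fraction(1, 7))
    print(genotype, {k: str(v) for k, v in row.items()}, "sum =", sum(row.values()))
```

Each printed row sums to 1, as the binomial identity requires.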




Our next assumption is that the melting curve of a mixture of duplexes is given by the
weighted sum of the individual duplex melting curves in proportion to their relative
concentrations. This makes sense since individual duplex melting curves already take into
account all fluorescence differences due to double vs. single stranded state of the duplex at a
given temperature as well as potentially reduced fluorescence of double stranded
heteroduplex bubbles. Identifying the duplex fluorescence vs. temperature melting curves by
their two strands, Fww’, Fwm’, Fmw’, Fmm’, and weighting them by the duplex concentrations
from Table 1b gives Table 1c, the theoretical melting curves of mixtures of reference DNA
fraction x with the various genotypes. Note that when x=0, these expressions reduce to the
wild type and mutant homozygote melting curves, and the heterozygote curve given by the
equally weighted sum of one-quarter of each duplex, two homoduplexes and two
heteroduplexes, so its overall heteroduplex content is one-half. We may interchange the words
homozygote and homoduplex when referring to their melting curves, but not heterozygote
and heteroduplex.
Table 1c. Melting curves of mixtures
Genotype    Melting Curve
W           1.0 Fww’(T)
M           x^2 Fww’(T) + x(1-x) Fwm’(T) + x(1-x) Fmw’(T) + (1-x)^2 Fmm’(T)
H           (1/4)(1+x)^2 Fww’(T) + (1/4)(1-x^2) Fwm’(T) + (1/4)(1-x^2) Fmw’(T) + (1/4)(1-x)^2 Fmm’(T)




Subtracting the melting curves in the lower two rows of Table 1c from the melting curve in
the upper row gives Table 1d, the theoretical difference curves between mixtures with wild
type and mutant homozygous samples, and between mixtures with wild type and
heterozygous samples, respectively. The difference curve between mixtures with mutant
homozygous and heterozygous samples will be obtained as the difference of these differences
after simplification. Note that the sum of the coefficients of the melting curves in each row of
Table 1d is zero, since we have subtracted rows whose coefficients each add to one.
Table 1d. Between mixture difference curves
Genotypes   Difference Curve
W-M         (1-x^2) Fww’(T) - x(1-x) Fwm’(T) - x(1-x) Fmw’(T) - (1-x)^2 Fmm’(T)
W-H         (1 - (1/4)(1+x)^2) Fww’(T) - (1/4)(1-x^2) Fwm’(T) - (1/4)(1-x^2) Fmw’(T) - (1/4)(1-x)^2 Fmm’(T)
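The coefficients of the difference curves can be checked with exact arithmetic. The following is a minimal Python sketch with hypothetical function names: it rebuilds the duplex fractions of each mixture from the wild type strand fraction, subtracts them, and confirms that each row of coefficients sums to zero.

```python
from fractions import Fraction

def duplex_row(w):
    """Duplex fractions (ww', wm', mw', mm') when a fraction w of strands is wild type."""
    m = 1 - w
    return (w * w, w * m, m * w, m * m)

def difference_rows(x):
    """Coefficients of (Fww', Fwm', Fmw', Fmm') in the W-M and W-H difference curves."""
    x = Fraction(x)
    wild = duplex_row(Fraction(1))       # wild type sample mixed with wild type reference
    mutant = duplex_row(x)               # homozygous mutant sample: wild type strands come from the reference
    hetero = duplex_row((1 + x) / 2)     # heterozygous sample: reference plus half of the sample
    return {
        "W-M": tuple(a - b for a, b in zip(wild, mutant)),
        "W-H": tuple(a - b for a, b in zip(wild, hetero)),
    }

# Print both coefficient rows for an arbitrary reference fraction.
for name, row in difference_rows(Fraction(1, 4)).items():
    print(name, [str(c) for c in row], "sum =", sum(row))
```

For any x between 0 and 1 the two heteroduplex coefficients in a row are equal and negative, and their magnitudes add up to the heteroduplex content of the mixture being subtracted.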

In Table 1e, we factor the heteroduplex content out of the expression in each row of Table 1d.
Since the coefficients of the heteroduplex curves Fwm’, Fmw’ in Table 1d are the same, both equal to
minus one-half of the total heteroduplex content, the coefficients of Fwm’, Fmw’ in brackets in
Table 1e are both -1/2. Since the sum of homoduplex and heteroduplex curve coefficients in each row
of Table 1d is zero, the coefficients of Fww’, Fmm’ in brackets in Table 1e must sum to +1.

Table 1e. Factored form of difference curves.
Genotypes   Difference Curve (Factored Form)
W-M         2x(1-x) [ (mww’ Fww’(T) + mmm’ Fmm’(T)) - (1/2)(Fwm’(T) + Fmw’(T)) ]
W-H         (1/2)(1-x^2) [ (hww’ Fww’(T) + hmm’ Fmm’(T)) - (1/2)(Fwm’(T) + Fmw’(T)) ]

where mww’(x) + mmm’(x) = 1 and hww’(x) + hmm’(x) = 1. Dividing the homoduplex coefficients in
Table 1d by the heteroduplex content gives the weights explicitly: mww’ = (1+x)/(2x),
mmm’ = -(1-x)/(2x), hww’ = (3+x)/(2(1+x)), hmm’ = -(1-x)/(2(1+x)).
Weights that sum to one are the coefficients of a weighted average. Because the coefficient of
Fmm’ in each row of Table 1d is negative, the Fmm’ weights are negative and the Fww’ weights exceed
one for 0<x<1, so this weighted average is an extrapolation rather than a convex combination
(linear interpolation), which would also require nonnegative weights. Table 1e says that
difference curves are described by the total heteroduplex content times the difference between a
weighted average of the homoduplex curves and the mean heteroduplex curve.
Since we are most interested in the situation when it is difficult to distinguish the
homozygous mutant melting curve from the wild type melting curve, we focus first on the
case when they are taken to be identical. When nearest-neighbor thermodynamics predicts that
the melting curves of the two homoduplexes are the same, this does not imply that the melting
curves of the two heteroduplexes are the same; in fact, that would be an unlikely coincidence.
And though the relative concentrations of the two heteroduplexes in these mixtures are always
equal, the relative concentrations of the homoduplexes are not. What is unique to the identical
homozygote situation is that, with respect to melting and difference curves, all weighting
factors give the same result: when Fww’(T)=Fmm’(T), aFww’(T)+bFmm’(T) is the same for any a and b
with a+b=1. This allows us to replace each of the weighting factors in Table 1e, mww’, mmm’,
hww’, hmm’, by 1/2, and completely separate the heteroduplex dependence from the temperature
dependence of the difference curves. Table 1f shows that in the case of identical homozygotes,
the melting curve difference is given by the heteroduplex content difference between mixtures
times the difference between the mean homoduplex curve and the mean heteroduplex curve.
Table 1f. Difference curves when homozygote curves are equal, Fww’(T)= Fmm’(T)
Genotypes   Difference Curve When Fww’(T)= Fmm’(T)
W-M         2x(1-x) [ (1/2)(Fww’(T)+Fmm’(T)) - (1/2)(Fwm’(T)+Fmw’(T)) ] = m(x)F(T)
W-H         (1/2)(1-x^2) [ (1/2)(Fww’(T)+Fmm’(T)) - (1/2)(Fwm’(T)+Fmw’(T)) ] = h(x)F(T)
M-H         ((3/2)x^2 - 2x + 1/2) [ (1/2)(Fww’(T)+Fmm’(T)) - (1/2)(Fwm’(T)+Fmw’(T)) ] = (h(x)-m(x))F(T)

Here m(x)=2x(1-x) is the heteroduplex content of the mixture with a mutant homozygous sample,
h(x)=(1/2)(1-x^2) is the heteroduplex content of the mixture with a heterozygous sample, and
F(T) = (1/2)(Fww’(T)+Fmm’(T)) - (1/2)(Fwm’(T)+Fmw’(T)) is the difference of the mean homoduplex
curve and the mean heteroduplex curve. Since there is no heteroduplex content in the mixture with
wild type, m(x) and h(x) are also the differences in heteroduplex content between their respective
mixtures and wild type. We have also obtained the expression for the difference curve between
mixtures of reference DNA with homozygous mutant and with heterozygous samples by subtracting the
differences of each from wild type: M-H=(W-H)-(W-M). The opposite difference, H-M, has the
opposite sign; only the magnitude |h(x)-m(x)| matters for the separation of the curves.
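The factorization into a heteroduplex content times a single temperature-dependent curve can be illustrated numerically. The sketch below is not taken from the paper's data: the logistic curve shape, the melting temperatures, and all function names are assumptions chosen only to give identical homoduplex curves and two distinct, less stable heteroduplex curves.

```python
import math

def sigmoid_melt(T, Tm, width=1.0):
    """Hypothetical melting curve: fraction of duplex still double stranded at temperature T."""
    return 1.0 / (1.0 + math.exp((T - Tm) / width))

# Assumed melting temperatures, for illustration only.
CURVES = {
    "ww'": lambda T: sigmoid_melt(T, 80.0),
    "mm'": lambda T: sigmoid_melt(T, 80.0),   # identical to the wild type homoduplex
    "wm'": lambda T: sigmoid_melt(T, 77.0),
    "mw'": lambda T: sigmoid_melt(T, 76.5),
}

def mixture_curve(genotype, x, T):
    """Weighted sum of duplex curves for a sample mixed with reference fraction x."""
    w = {"W": 1.0, "M": x, "H": (1.0 + x) / 2.0}[genotype]  # wild type strand fraction
    m = 1.0 - w
    weights = {"ww'": w * w, "wm'": w * m, "mw'": m * w, "mm'": m * m}
    return sum(weights[d] * CURVES[d](T) for d in CURVES)

def mean_difference(T):
    """Mean homoduplex curve minus mean heteroduplex curve."""
    homo = 0.5 * (CURVES["ww'"](T) + CURVES["mm'"](T))
    hetero = 0.5 * (CURVES["wm'"](T) + CURVES["mw'"](T))
    return homo - hetero

def m_content(x):
    return 2.0 * x * (1.0 - x)

def h_content(x):
    return 0.5 * (1.0 - x * x)

# Compare the directly computed differences with the factored forms at one temperature.
x, T = 0.3, 78.0
print(mixture_curve("W", x, T) - mixture_curve("M", x, T), m_content(x) * mean_difference(T))
print(mixture_curve("W", x, T) - mixture_curve("H", x, T), h_content(x) * mean_difference(T))
```

Each printed pair agrees to rounding error, whatever values of x and T are chosen, because the homoduplex curves were assumed identical.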
In the case of identical homozygote melting curves, described by Table 1f, we can explicitly
determine the reference DNA fraction x which maximizes the separation between melting curves
corresponding to mixtures of that fraction with different genotypes. This is because the separation
between any such pair of curves is proportional to the magnitude of the heteroduplex content
difference between the mixtures, which we have computed for all three pairs as m(x), h(x), and
|h(x)-m(x)|, and which are plotted in Fig. 3. Any quantitative measure of separation, such as the
area between curves or their maximum separation, will be proportional to the appropriate one of
these functions. For the sake of definiteness, we take the ability to distinguish all genotypes to
be determined by the maximum separation between the closest pair, so the separation between a
pair of mixtures is given by their heteroduplex content difference times the maximum value of
F(T), which we will call F. For example, when the reference DNA fraction x=0 and samples are
unmixed, wild type and homozygous mutant samples have heteroduplex content equal to 0,
and heterozygous samples have heteroduplex content equal to 1/2. The maximum difference between
either homozygous curve and the heterozygous curve is (1/2)F, but the maximum difference between
the two homozygous curves is zero. To maximize our ability to distinguish all three genotypes, we
seek the value of the reference DNA fraction x which maximizes the smallest of the three absolute
heteroduplex content difference functions. To find this value, we first place the three absolute
heteroduplex content difference functions in increasing order, which depends on the value of x,
on each interval where the order differs.
Table 1g. Order of absolute heteroduplex content difference on intervals where it differs.
Interval        Smallest, s(x)    Middle        Largest
0<x<1/7         m(x)              h(x)-m(x)     h(x)
1/7<x<1/3       h(x)-m(x)         m(x)          h(x)
1/3<x<1         m(x)-h(x)         h(x)          m(x)

The first column gives the function s(x) we wish to maximize across the full interval of reference
DNA fractions x. In Fig. 3, this is the union of the lowest graphs on each interval: m(x) from
reference DNA fraction x=0 to x=1/7, h(x)-m(x) from x=1/7 to x=1/3, and m(x)-h(x) from x=1/3 to x=1.
A theorem from calculus confirms our visual intuition that the maximum of s(x) can only occur
where the slope of its tangent is zero, or where it does not have a well-defined tangent: if s(x)
has a local extremum at x=a, then s'(a)=0 or s'(a) does not exist. The only place s'(x)=0 is
halfway between its roots 1/3 and 1 (as it is quadratic in this interval), i.e., at x=2/3. This
corresponds to adding twice as much wild-type DNA as there was unknown DNA and gives a maximum
separation of (1/6)F between the heterozygous and homozygous SNP curves, or 1/3 of the original
separation (1/2)F between the heterozygous melting curve and the other two. The separation between
the wild-type melting curve and the other two melting curves will be larger.

The only places where s(x) is not differentiable are where it changes form, i.e., at x=1/7 and
x=1/3. Comparing the values of s(x) at these points and at x=2/3, namely 12/49, 0, and 1/6, we find
that the optimal mixture fraction occurs at x=1/7, as indicated in Fig. 3. For this reference DNA
fraction, at the temperature of maximum separation, the melting curves of mixtures with
heterozygous and wild type samples are (24/49)F apart, barely reduced from the separation
(1/2)F = (24/48)F when the same samples were not mixed. What we have gained is that at this
reference DNA fraction, instead of overlapping the wild type curve as it did when the samples were
not mixed, the melting curve of a mixture with a homozygous mutant sample is exactly halfway
between the melting curves of mixtures with wild type and heterozygous samples, (12/49)F away
from both at the temperature of maximum separation.
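The optimization can be reproduced with exact rational arithmetic. The following Python sketch uses the expressions m(x)=2x(1-x) and h(x)=(1/2)(1-x^2) for the heteroduplex contents; the grid resolution of 1/420 is an arbitrary choice made here so that 1/7, 1/3, and 2/3 all fall on grid points.

```python
from fractions import Fraction

def m(x):
    """Heteroduplex content of the mixture with a homozygous mutant sample."""
    return 2 * x * (1 - x)

def h(x):
    """Heteroduplex content of the mixture with a heterozygous sample."""
    return (1 - x * x) / 2

def s(x):
    """Smallest of the three absolute heteroduplex content differences."""
    return min(m(x), h(x), abs(h(x) - m(x)))

# Candidate maximizers: the two kinks of s and its one interior critical point.
for x in (Fraction(1, 7), Fraction(1, 3), Fraction(2, 3)):
    print(x, s(x))
# → 1/7 12/49
# → 1/3 0
# → 2/3 1/6

# Brute-force confirmation on a grid over [0, 1].
grid = [Fraction(k, 420) for k in range(421)]
best = max(grid, key=s)
print(best, s(best))
# → 1/7 12/49
```

At x=1/7 the three separations are m=12/49, h-m=12/49, and h=24/49, which matches the values quoted above.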
In retrospect, we can give a simple heuristic explanation for this value when we recognize that it
corresponds to adding one part wild-type DNA to six parts unknown sample. As we saw above,
the melting curves will be optimally separated when the homozygous mutant curve is equally
separated from both the wild-type and heterozygous melting curves, so the heteroduplex content of
the mixture with a homozygous mutant sample must be exactly half that of a mixture with a
heterozygous sample. The ratio of 1 part wild-type to 6 parts unknown is optimal because when we
divide 6 in equal parts (3+3, representing the heterozygous sample strands), add 1 to one of the
parts (4 = 3 wild type sample strands plus 1 reference strand) and multiply (3)(4)=12 to represent
heteroduplexes formed, we obtain exactly twice the product of the original number (6, representing
the homozygous SNP strands) multiplied by one (the reference strand).
So at the simplest level, it is because ((1/2)6)((1/2)6+1) = 2(6)(1) that the optimal reference DNA
fraction is 1/7. This is visualized in the animation
http://www.math.utah.edu/~palais/optimal_mixing.html
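The heuristic count can also be written as a small search. In this Python sketch (the function name and the search range of 1 to 20 parts are arbitrary choices for illustration), one part of wild type reference is added to n parts of sample, and we look for the n at which the heterozygous product is exactly twice the homozygous product.

```python
from fractions import Fraction

def heteroduplex_products(sample_parts, reference_parts=1):
    """Strand products standing in for heteroduplex formation in the two mixtures."""
    n, r = Fraction(sample_parts), Fraction(reference_parts)
    homozygous = n * r                     # mutant sample strands times reference strands
    heterozygous = (n / 2) * (n / 2 + r)   # mutant half times (wild type half plus reference)
    return homozygous, heterozygous

balanced = []
for n in range(1, 21):
    homozygous, heterozygous = heteroduplex_products(n)
    if heterozygous == 2 * homozygous:
        balanced.append(n)

print(balanced)                      # → [6]
print(Fraction(1, 1 + balanced[0]))  # → 1/7
```

Only n=6 satisfies the condition, so the reference makes up 1 part in 7 of the mixture.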