Title: Digital Genotyping and Haplotyping with Polymerase Colonies
Rob Mitra et al.
Harvard Medical School, Lipper Center for Computational Genetics, 200
Longwood Ave., Boston, MA 02115.
116107171
2/13/2016 3:00:00 AM
Page 1 of 30
Abstract: The polymerase colony (polony) technology amplifies
multiple individual DNA molecules in a thin acrylamide gel on the surface
of a glass microscope slide. In each resulting polony of double stranded
DNA, one strand is covalently attached to the gel. We genotype these
polonies by performing single base extensions with dye-labeled
nucleotides, and we demonstrate the accurate quantitation of two allelic
variants using this technology. We also show that polony technology can
be used to directly determine the phase, or haplotype, of two single
nucleotide polymorphisms (SNPs). We correctly determined the genotype
and phase of three different pairs of SNPs. In one case, the distance
between the two SNPs is 45 kilobases, the largest distance achieved to
date without separating the chromosomes by cloning or somatic cell
fusion. The results indicate that polony genotyping and haplotyping may
play an important role in understanding genetic variation.
Introduction
One goal of genomic science is to find genetic variation that predicts
susceptibility to disease. Individuals who have been identified as being at risk
could then change their diet, lifestyle, or environment to reduce their chances of
developing disease. For patients who have already developed disease, genetic
markers could guide the choice of therapy to increase the likelihood of a
successful outcome.
Most researchers who study DNA variation are focusing on single
nucleotide polymorphisms (SNPs), as these are the most common variations in
the human population. By studying candidate genes and performing genome-wide linkage studies, scientists are trying to home in on the “causative SNP”, the
SNP that alters gene function and increases the risk of disease. However,
recent studies[Drysdale, 2000 #1; Hoehe, 2000 #24] suggest that, for some
genes, there may not be one single SNP that is responsible for altering protein
function or expression - and thereby causing disease, but, instead, multiple SNPs
that interact to alter function or expression[Davidson, 2000 #20]. Furthermore,
this phenotype only occurs when these SNPs are present on the same
chromosome, so one must determine the haplotype of these SNPs to find a
correlation to the observed phenotypes. In these cases, then, we have traded
the notion of a causative SNP for that of a causative haplotype.
What existing technologies allow one to determine the haplotype, or phase, of a pair
of SNPs? Currently, the most common approach is to first genotype the SNPs to
acquire unphased data from multiple related individuals and to then infer the
haplotype computationally[Stephens, 2001 #4; Clark, 1998 #1; Excoffier, 1995
#2; Hawley, 1995 #3; Hoehe, 2000 #24; Niu, 2002 #6]. The development of this
methodology has greatly increased the power of both linkage studies and
candidate gene studies. However, the computational inference of haplotypes
has been estimated to be only 75-95% accurate[Stephens, 2001 #4; Niu, 2002
#6;Tishkoff, 2000 #13], making this technique an unlikely candidate for use in a
clinical setting, as well as presenting challenges when used as a research tool.
Daly et al. [Daly, 2001 #27] and others[Patil, 2001 #31; Gabriel, 2002 #28] have
shown that SNPs tend to be inherited in larger haplotype blocks than previously
thought and that there are relatively few variants of each block. This observation
has sparked a public effort to characterize all common haplotypes in the human
population[Couzin, 2002 #32].
Prior knowledge of common haplotype blocks
may make it easier to infer the phase of SNPs that lie within the same haplotype
block[Zhang, 2002 #18]. However, even with this knowledge, it will be difficult to
accurately predict the haplotype of two SNPs that are in different haplotype
blocks because these two SNPs will not typically be in linkage disequilibrium.
These cases may be quite common since the genome contains about 100 kb per
gene and the average haplotype block is only 22 kilobases in European and
Asian populations and 11 kilobases in Yoruban and African-American
populations[Gabriel, 2002 #28]. This point is well illustrated by two known
mutations, R347->H and A970->D, in the CFTR gene[Clain, 2001 #17] that are
separated by 65 kbp. When present in cis, they interact to produce more severe
symptoms of cystic fibrosis than when present in trans, or when only one
mutation is present. If haplotypes are to be used in the clinic as a prognostic
marker, a direct molecular haplotyping technology is necessary.
Current methods for the direct determination of haplotypes have clear
limitations. Allele specific PCR[Michalatos-Beloin, 1996 #14] and single
molecule PCR[Ruano, 1990 #2] require significant optimization and cannot
routinely determine the phase of SNPs separated by more than 10-15 kilobases.
Atomic force microscopy[Woolley, 2000 #21] is an interesting alternative, but it is
unclear how easily this technology can be scaled up, and it requires expensive
equipment not commonly found in a molecular biology laboratory. Methods in
which chromosomal fragments are cloned into BACs or in which somatic cell
hybrids[Patil, 2001 #31;Douglas, 2001 #22] are made are not cost effective when
one is interested in phasing a small number of SNPs for a large number of
samples[Douglas, 2001 #22], as would be necessary for a clinical diagnostic.
Here, we present a method to determine haplotypes using polymerase
colony (polony) technology, a technology in which a large number of individual
DNA molecules are cloned, amplified, and analyzed on a glass microscope
slide[Mitra, 1999 #3]. We determined the phase of three different pairs of SNPs
up to 45 kilobases apart. In principle, distances of hundreds of megabases
(whole chromosomes) are possible. This technology requires very little DNA as
input - we show that a buccal swab provides enough DNA to perform hundreds of
reactions. We also demonstrate that a large number of polony assays can be
performed on a single microscope slide, reducing the cost per assay.
As a prerequisite to determining haplotypes, it was necessary to
demonstrate that polony technology could be used to determine genotypes.
Polony genotyping also has many applications such as detecting loss of
heterozygosity[Zhou, 2001 #7], quantifying allelic imbalance[Yan, 2002 #9] and
detecting rare somatic mutations in a background of wild-type DNA[Lizardi,
1998 #34]. Therefore, in addition to demonstrating haplotyping, we also present
data that demonstrate the utility of polony genotyping for these applications.
Materials and Methods
Polony Amplification
Template (50 to 100,000 molecules) was added to the polony amplification
mixture [10 mM Tris-HCl pH 8.3, 50 mM KCl, 0.01% gelatin, 1.5 mM MgCl2, 200 µM
dNTPs, 10 U JumpStart Taq (Sigma), 5.91% acrylamide, 0.09% bis-acrylamide,
0.5 µM forward primer (with acrydite group), 0.5 µM reverse primer, 0.1% Tween
20, 0.2% BSA]. Ammonium persulfate and TEMED were added to a final
concentration of 0.083%. A 15 µm thick gel was poured on a glass microscope
slide that was partially covered with a Teflon coating (Erie Scientific). The Teflon
coating served as a spacer between the glass surface of the slide and a glass
coverslip (20 mm x 30 mm no. 2, Fisher Scientific). The gel was allowed to
polymerize under argon for 30 minutes. The coverslip was overlaid with mineral
oil and the slide was cycled using the following program: denaturation (2 minutes at
94°C), 40 cycles (30 s at 94°C, 30 s at 56°C, 1 min at 72°C), and extension (2 min
at 72°C). After cycling, the mineral oil was removed by rinsing the slides in
hexane.
Polony amplification for the haplotyping reactions was performed as
above, except that four primers were used (two forward and two reverse primers) at a
concentration of 0.25 µM each.
The polony protocol was modified to amplify polonies in gels that
contained the cleavable crosslinker DATD (see Results). Instead of polymerizing
the acrylamide gel with the template and PCR reagents present, we polymerized
the gel first and later diffused in the DNA template molecules and PCR reagents.
In this protocol, we made the gel mix [Tris-HCl pH 8.3, 50 mM KCl, 0.01% gelatin,
1.5 mM MgCl2, 7.6% acrylamide, 0.36% DATD, 0.036% bis-acrylamide, 0.5 µM
acrydite-modified reverse primer, 0.1% Tween 20, 0.2% BSA] and then added
ammonium persulfate and TEMED to a final concentration of 0.083%. We poured
a 15 µm thick gel on a glass microscope slide that was partially covered with a
Teflon coating (Erie Scientific). The Teflon coating served as a spacer between
the glass surface of the slide and a glass coverslip (20 mm x 30 mm no. 2, Fisher
Scientific). The gel was allowed to polymerize under argon for 30 minutes. The
slides were washed in deionized water, allowed to dry, and stored under
vacuum until use. To perform polony amplification, we covered the polymerized
gel with PCR amplification mix [500 to 5 x 10^4 molecules/µl template, Tris-HCl
pH 8.3, 50 mM KCl, 0.01% gelatin, 0.2% BSA, 0.1% Tween 20, 0.5 µM primer
PR1pcr2.1-R, 200 µM dNTPs, 0.335 units/µl JumpStart Taq] for 2 minutes and
then removed excess fluid. The gel was covered with 35 µl of mineral oil and
a coverslip. The slides were cycled as follows: denaturation (2
minutes at 94°C), 44 cycles (30 s at 94°C, 45 s at 56°C, 90 s at 72°C). After
amplification, the DATD crosslinker was cleaved by treating the slides with
100 mM NaIO4 for 15 minutes at room temperature. Next, we washed the slides in
deionized water for 5 minutes, in inactivation buffer [50 mM ethanolamine,
100 mM Tris-HCl pH 9.0, 0.1% SDS] for 30 minutes at room temperature, and in
deionized water for 5 minutes.
Denaturing Polony Gels
After polony amplification, the unattached DNA strand was removed by
incubating in 70°C denaturing buffer [70% formamide, 1x SSC] and
electrophoresing in 0.5x TBE with 42% urea for 1 hour at 5-10 V/cm. The slides
were then washed 2 x 4 minutes in wash buffer 1 [10 mM Tris-HCl pH 7.5, 50 mM
KCl, 2 mM EDTA, 0.01% Triton X-100].
For polony haplotyping reactions, after the first SBE, the extended primers
were removed by washing in 70°C denaturing buffer and the slides were
washed 2 x 5 minutes in dH2O.
Single Base Extension (SBE) Reactions
The SBE reactions used in the polony haplotyping and genotyping experiments
in this study were carried out using fluorescent deoxynucleotides. To do so, the
acrylamide gel was covered with a frame seal chamber (MJ Research) and
annealing mix [0.25 µM SBE primer, 6x SSPE, 0.01% Triton X-100] was added
over the gel. The slides were heated at 94°C for 2 minutes, then at 56°C for 15
minutes. We removed unannealed primer by washing the slides 2 x 4 minutes in
wash buffer 1 and then equilibrated the slides in 1x Klenow buffer [10 mM
Tris-HCl pH 7.5, 10 mM MgCl2]. Next, we covered the gel with 40 microliters of
extension mix [1x Klenow buffer, Klenow exo- polymerase X units/µl, E. coli
single-stranded binding protein, 1 µM Cy3- or Cy5-labeled deoxynucleotide] for two
minutes and then washed the slides in wash buffer 1. The slides were scanned
on a scanning confocal microscope designed for microarrays (ScanArray 5000,
GSI Lumonics).
SBE reactions with dideoxynucleotides were performed as follows. The
acrylamide gel was covered with a frame seal chamber (MJ Research) and
annealing mix [0.25 µM sequencing primer, 6x SSPE, 0.01% Triton X-100] was
added over the gel. The slides were heated at 94°C for 2 minutes, then at 56°C
for 15 minutes. Unannealed primer was removed by washing the slides 2 x 4
minutes in wash buffer 1. The slides were equilibrated in 1x AmpliTaq FS buffer
[10 mM Tris-HCl pH 8.0, 50 mM KCl, 1.5 mM MgCl2]. Next, the gel was covered
with 40 microliters of extension mix [1x AmpliTaq FS buffer, 2 µM FITC-12-ddUTP,
2 µM ROX-ddCTP, 2 µM Cy5-ddATP, 2 µM Cy3-ddGTP, AmpliTaq FS X units/µl,
E. coli single-stranded binding protein]. The gel was covered with a frame seal
chamber and heated to 55°C for 4 minutes. A wash in wash buffer 1 was
performed and the slides were scanned on a scanning confocal microscope
designed for microarrays (ScanArray 5000, GSI Lumonics).
Image Analysis
Images of polony gels were acquired in TIF format. The images were filtered
using a Wiener filter and a median filter to remove speckle and noise. The
background was subtracted and polonies were computationally identified using
the ImageQuantNT software package. This package quantified the fluorescent
intensity of each polony and output the data as a text file. Overlapping polonies
were identified using a MATLAB script, HAPCALL, which is available at
http://arep.med.harvard.edu. For the polony genotyping experiment in which the
relative abundances of two alleles are measured, the images were smoothed,
polonies were identified, and their genotypes were determined using the MATLAB
script polony_call.m, also available at http://arep.med.harvard.edu.
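As a rough illustration of the overlap-calling step, the sketch below pairs polonies from two scans whose centers lie within a maximum distance of each other. It is not the HAPCALL script: the function name, the `(x, y, allele)` input format, and the 50 µm cutoff are assumptions made for illustration only.

```python
import math

def find_overlaps(polonies_a, polonies_b, max_dist):
    """Pair polonies from two loci whose centers lie within max_dist.

    polonies_a, polonies_b: lists of (x, y, allele) tuples, one per polony,
    with coordinates in the same units as max_dist (hypothetical format).
    Returns a list of (allele_a, allele_b) tuples, one per overlapping pair.
    """
    pairs = []
    for xa, ya, allele_a in polonies_a:
        for xb, yb, allele_b in polonies_b:
            # Euclidean distance between the two polony centers.
            if math.hypot(xa - xb, ya - yb) <= max_dist:
                pairs.append((allele_a, allele_b))
    return pairs

# Toy coordinates in micrometers; the values are invented for illustration.
locus1 = [(100.0, 100.0, "T"), (900.0, 400.0, "C")]
locus2 = [(110.0, 95.0, "T"), (2000.0, 2000.0, "C")]
pairs = find_overlaps(locus1, locus2, max_dist=50.0)
```

Each returned pair is one vote for a haplotype; tallying the votes gives the counts of concordant and dissenting overlapping polonies.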
Oligonucleotides and Patient DNA
All primers used to perform the polony amplification reactions were
designed using Primer 3 software[Rozen, 1998 #41]. We found it was necessary
to set the following parameters in order to obtain good results:
PRIMER_OPT_SIZE=25, PRIMER_MIN_SIZE=19, PRIMER_MAX_SIZE=30,
PRIMER_OPT_TM=70, PRIMER_MIN_TM=64, PRIMER_MAX_TM=73,
PRIMER_MAX_DIFF_TM=5, PRIMER_MIN_GC=45, PRIMER_MAX_GC=80.
For some experiments, the parameter PRIMER_PRODUCT_SIZE_RANGE=90-100 was used. All other parameters were set to default values. The names and
sequences of the oligonucleotides used to amplify polonies are as follows:
Locus containing SNP DK438: Primer DK438AP.1.FM 5’
QCATTGAGTCCTTACTGTGCACACAGCTC 3’; Primer DK438AP.1.R 5’
GGGGGAAATCCACTGAGCTAAATTGC 3’. Locus containing SNP DK445-2:
Primer DK445-2AP.1.F 5’ GGTCCCCACCTAGGCCTCTGTGTTA 3’; Primer
DK445-2AP.1.RM 5’ QTGAGTCCCTCAAACCCCTTTCTTCTG 3’. Locus
containing SNP DK331: Primer DK331AP.1.FM 5’
QTGTTGGTATGGCAGAATGTAGCATGG 3’; Primer DK331AP.1.R
5’GGCGGTGAGAAAAGGTTTTAATGG 3’; Locus containing SNP C/T –13910:
Primer IN13L126PS2F 5’ GGCCTCTGCGCTGGCAATACAG 3’; Primer
In13l126ps2RM 5’ QCCTCGTGGAATGCAGGGCTCAA 3’; Locus containing
SNP G/A –22018: Primer In9L125ps3FM
5’QGATGTCCTTAAAAACAGCATTCTCAGC 3’; Primer In9L125ps3r
5’ CCATGTTGGCCAGGCTGGTCTC 3’. Model Templates for SBE
Quantitation: Primer PR1-RM 5’ QCTGCCCCGGGTTCCTCATTCTCT 3’; Primer
PR1pcr2.1-R 5’ CCATGTAAGCCCACTGCAAGCTACC 3’; INSERT jays primers
here; PR1-F 5’ CCACTACGCCTCCGCTTTCCTCTC 3’.
The following oligonucleotides were used as primers for the single base
extension reactions: Primer Seq 438
5’GAGCTAAATTGCACATAACTTAGTAACAGGCTTA3’; Primer Seq 445-2 5’
ACCTAGGCCTCTGTGTTAGTCTGTTTTCA 3’; Primer Seq 331
5’ACCTAGGCCTCTGTGTTAGTCTGTTTTCA 3’; Primer In9L102ps2R 5’
GGGACAAAGGTGTGAGCCACCG 3’; Primer SeqIN13ps2 5’
GGCCTCTGCGCTGGCAATACAGATAAGATAATGTAG 3’. Primer Hybe
010129-1GA 5’ TATGGGCAGTCGGTGATAGAGTGGTGGA 3’. INSERT JAYS
PRIMER HERE.
Patient DNA used to haplotype SNPs DK438, DK445-2, and DK331 was
obtained from the Coriell Institute. Patient DNA used to haplotype the SNPs G/A
–22018 and C/T –13910 was purified from buccal swabs using the MasterAmp
buccal swab DNA extraction kit (Epicentre).
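The Primer3 settings listed earlier in this section can be collected into a single input record. The sketch below writes them in Primer3's Boulder-IO format (one `TAG=VALUE` per line, with a lone `=` ending the record). The tag names are copied as written in the text, and the product size range is read here as 90-100. The helper name `to_boulder_io` is hypothetical, and a usable record would also need a template-sequence tag, whose name depends on the Primer3 version and is therefore omitted.

```python
# Settings quoted in the text; tag names are copied as written there.
PRIMER3_SETTINGS = {
    "PRIMER_OPT_SIZE": 25,
    "PRIMER_MIN_SIZE": 19,
    "PRIMER_MAX_SIZE": 30,
    "PRIMER_OPT_TM": 70,
    "PRIMER_MIN_TM": 64,
    "PRIMER_MAX_TM": 73,
    "PRIMER_MAX_DIFF_TM": 5,
    "PRIMER_MIN_GC": 45,
    "PRIMER_MAX_GC": 80,
}

def to_boulder_io(settings, product_size_range=None):
    """Render settings as one Boulder-IO record (hypothetical helper)."""
    lines = [f"{tag}={value}" for tag, value in settings.items()]
    if product_size_range is not None:
        lines.append(f"PRIMER_PRODUCT_SIZE_RANGE={product_size_range}")
    lines.append("=")  # a lone "=" terminates a Boulder-IO record
    return "\n".join(lines) + "\n"

record = to_boulder_io(PRIMER3_SETTINGS, product_size_range="90-100")
```

Keeping the settings in one dictionary makes it straightforward to reuse the same constraints for every locus and to add the product size range only for the experiments that used it.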
Results
Principles Underlying Polony Haplotyping. Our approach is shown in
Figure 1. One hundred to five hundred genome equivalents of patient DNA are
diluted into a mixture of acrylamide monomer, bis cross-linker, and PCR
reagents. Two pairs of primers are included in this mixture, one pair flanking the
first SNP of interest (Figure 1 inset), the other pair flanking the second SNP of
interest, and this mixture is used to pour a thin (15 micrometer) acrylamide gel on
a glass microscope slide. Because the concentration of patient DNA is so low,
the chromosomes are well separated from each other on the surface of the slide.
PCR is then performed using a PCR machine designed to accommodate slides.
Each chromosome is amplified at two loci by the PCR reaction, and the
acrylamide matrix prevents the amplification products from diffusing very far. As
such, double stranded DNA accumulates around the chromosome, forming two
overlapping polonies - each amplified from a different region on the same
chromosome molecule. A key feature of this protocol is the use of modified
primers in the PCR reaction that covalently attach one strand of the amplified
DNA to the acrylamide matrix[Rehman, 1999 #196]. This feature allows the
unattached other strand to be removed from all polonies by heating and washing
the slide, leaving single stranded templates for the subsequent single base
extension (SBE) reactions that will determine the genotypes of the two SNPs.
After genotyping all polonies, the phase of the SNPs is then determined by
identifying overlapping polonies.
Proof-of-Principle. To perform the protocol described above, it was first
necessary to establish that I) multiple polonies could be amplified from a single
molecule of DNA and II) single base extension reactions[Pastinen, 1996 #4;
Pastinen, 1997 #36; Syvanen, 1994 #35; Dubiley, 1999 #38] could be performed
on DNA covalently attached to the acrylamide gel. To confirm that two polonies
could be amplified from a single DNA molecule, we first cut a circular plasmid
template (Figure 2c) with EcoRI to make it a linear molecule. We then amplified
this linear template in a polony reaction using two sets of PCR primers – each
primer pair chosen to amplify a different region of the template molecule
(designated regions A and B in figure 2). After amplification, the polonies were
made single stranded by heating the slide and washing away the unattached
DNA strand. Next, two dye-labeled oligonucleotides were hybridized to the
gel (Figure 2a). The oligonucleotide complementary to DNA sequence located in
region A was labeled with a Cy5 molecule (red), and the oligonucleotide
complementary to DNA sequence located in region B was labeled with a Cy3
(green) molecule. In a separate control reaction, we cleaved the circular plasmid
with two restriction endonucleases, EcoRI and NcoI, so that region A and region
B were no longer on the same molecule of DNA (figure 2b). When the singly cut
plasmid was used as the template for polony amplification, numerous
overlapping polonies could be identified after the hybridization, as evident from
the large number of yellow polonies in Figure 2a. The doubly cut plasmid
produced few polonies that overlapped. The polonies that did overlap did so only
near their edges and were the result of two separate DNA molecules falling near
each other when the gel was poured. These results demonstrate that a single
DNA molecule can give rise to two overlapping polonies. We determined the
efficiency of amplification for the two primer pairs to be 85% and 81% (see
methods).
We next characterized the specificity of single base extension (SBE) on
acrylamide-immobilized DNA. We used a single dye-labeled deoxynucleotide or
dideoxynucleotide to extend primer:template duplexes by one base in a DNA
polymerase catalyzed reaction. We performed four reactions for each nucleotide
tested to determine the specificity of the SBE reaction for the correct base
relative to all possible mismatches. The results are shown in Table 1. SBE
reactions with both fluorescent deoxynucleotides and dideoxynucleotides showed
good discrimination for the correct base. We chose to use fluorescent
deoxynucleotides in our SBE reactions as they performed somewhat better and
are not as expensive as fluorescent dideoxynucleotides.
Polony Haplotyping on Patient DNA. We determined the phase of two SNPs
that are 11.8 kilobases apart on chromosome 7. These SNPs, DK438, a T->C
mutation in intron 4 of the CFTR gene, and DK445-2, a T->C mutation in intron 9
of the CFTR gene, have been previously characterized [Keen, Housman
unpublished], and their phase is known because they are in strong linkage
disequilibrium with one another. We used two patient samples that were
heterozygous at these alleles and amplified both loci in a polony reaction. Next,
we performed an SBE reaction to genotype the SNP DK438. The results for one
sample are shown in Figure 2a. Green polonies correspond to the wild-type
allele (T) and red polonies correspond to the mutant allele (C). The primers were
stripped from the slides and we performed a second SBE reaction to genotype
the SNP DK445-2. Again, green polonies correspond to the wild-type allele (T)
and red polonies correspond to the mutant allele (C). The images were merged and
overlapping polonies were identified computationally (see Methods) and are
circled in the figure. We found 22 overlapping polonies (Table 1), all of which
indicated the correct haplotype (the base T at DK438 in cis with the base T at
DK445-2).
When polonies are amplified in a polony haplotyping reaction, two polonies
can overlap not because they were amplified from the same molecule of DNA
but because two different DNA molecules happened to land very close to each
other when the gel was poured. Pairs of overlapping polonies that arise in this
fashion provide no information
about the phase of the two SNPs being queried. To confirm that this
phenomenon did not cause an error in our called haplotypes, we estimated the
probability that the results we observed could have occurred by this process.
This probability is a function of the density of polonies on the slide, the number of
observed overlapping polonies, the maximum distance between the center of two
polonies that are called overlapping, and the number of overlapping polonies that
predict the same haplotype (see Methods). For the two samples used in the
polony haplotyping reaction, we found the p-values to be 6e-22 and 6e-24 (Table
1), indicating that we can have high confidence in the called haplotype.
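To give a feel for how polony density and the overlap distance cutoff enter this estimate, the sketch below computes the expected number of chance overlaps under a simple model in which polony centers are scattered uniformly and independently over the gel. This is an illustrative approximation, not the calculation used in this study. The polony counts come from the first row of Table 1, while the 0.1 mm cutoff and the 600 mm^2 gel area are hypothetical values.

```python
import math

def expected_chance_overlaps(n_a, n_b, max_dist, area):
    """Expected number of locus-A/locus-B polony pairs whose centers fall
    within max_dist of each other purely by chance.

    Assumes centers are scattered uniformly and independently over `area`
    and ignores edge effects (a simplification, not the authors' method).
    max_dist and area must use consistent units (e.g. mm and mm^2).
    """
    # Chance that one specific pair of centers lands within max_dist.
    pair_probability = math.pi * max_dist**2 / area
    return n_a * n_b * pair_probability

# Polony counts from the first row of Table 1; the 0.1 mm distance cutoff and
# the 600 mm^2 gel area are hypothetical values chosen only for illustration.
expected = expected_chance_overlaps(94, 92, max_dist=0.1, area=600.0)
```

Under these assumed values fewer than one chance overlap is expected. The expectation grows with the product of the two polony counts and with the square of the distance cutoff.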
One feature of the polony haplotyping technology is that it requires very
little patient DNA; a buccal swab should collect enough DNA for many
reactions. To test this hypothesis, we collected buccal swabs from 5 subjects
and haplotyped two SNPs recently shown to be involved in hypolactasia [Enattah,
2002 #42]. These SNPs, a G->A variant 22018 bases upstream of the gene
MCM6 (designated G/A -22018), and a C->T variant 13910 bases upstream of
the gene MCM6 (designated C/T -13910) are in strong linkage disequilibrium, so
we expected to find only one of the two possible haplotypes. The results of these
polony haplotyping reactions are summarized in Table 2. For all samples, the G
variant at -22018 was found to be on the same chromosome as the C variant at
-13910, consistent with the predicted linkage. For some samples, not every pair
of overlapping polonies called the same haplotype. For example, in the patient
NR, 32 pairs of overlapping polonies indicated the correct haplotype, but 3 pairs
indicated the other haplotype. These overlapping polonies were most likely
amplified from different template molecules and happened to overlap because
the DNA was plated at a relatively high density. In spite of these occasional
overlapping pairs with a dissenting prediction, the calculated p-values (Table 2)
indicate that we can have high confidence in the called haplotype.
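The p-values listed in Table 2 are numerically consistent with a one-sided binomial tail: the probability of seeing at least the observed number of concordant pairs if each overlapping pair voted for either haplotype with probability 0.5. This correspondence is inferred from the tabulated numbers and is not stated in the text; the estimate described above also depends on polony density. A minimal sketch under that assumption, with a hypothetical function name:

```python
from math import comb

def binomial_tail(n_pairs, n_concordant, p_null=0.5):
    """P(X >= n_concordant) for X ~ Binomial(n_pairs, p_null).

    The null probability of 0.5 is an assumption inferred from Table 2.
    """
    return sum(
        comb(n_pairs, k) * p_null**k * (1 - p_null) ** (n_pairs - k)
        for k in range(n_concordant, n_pairs + 1)
    )

# Overlapping-pair counts taken from Table 2.
p_nr = binomial_tail(49, 46)  # patient NR: 49 pairs, 46 concordant
p_bw = binomial_tail(11, 10)  # patient BW: 11 pairs, 10 concordant
```

With these inputs the function returns values matching the NR and BW rows of Table 2 (3.49054E-11 and 0.005859375), which is why a few dissenting pairs barely affect confidence in the call.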
The two pairs of SNPs that were haplotyped in the above examples were
separated by 11.8 kb and 8.1 kb of genomic sequence, respectively. However, in
principle, one should be able to phase two SNPs separated by any distance, as
long as the sample is not sheared or degraded to the extent that most of the DNA
molecules in the sample are too short to contain both SNPs. To assess the
degree of degradation in our samples, we performed agarose gel
electrophoresis, and chose the sample that contained the least amount of low
molecular weight fragments. We estimated the average fragment size of this
sample to be greater than 80kb. Next, we performed polony haplotyping to
phase two SNPs, DK331 and DK445-2, in the CFTR gene that are separated by
45 kilobases. There were 153 polonies amplified at the locus surrounding SNP
DK331 (figure 4a) and 175 polonies amplified at the locus surrounding SNP
DK445-2 (figure 4b). We identified 34 overlapping polonies from the merged
scans and 32 of these indicated the same haplotype. From this, we conclude
that the A variant at SNP DK331 is on the same chromosome as the T variant at
DK445-2 (p = 2e-9).
Considerations When Working With Small Polonies. The cost of a polony
reaction is related to the size of the polonies amplified in the reaction. Smaller
polonies mean less area on the slide is needed to haplotype SNPs, and therefore
a smaller volume of the necessary reagents is used per reaction. We have
previously reported a strong dependence of polony size on the length of the
amplified PCR product; longer PCR products result in smaller polonies[Mitra,
1999 #3]. In that study, polonies were detected by staining the DNA with an
intercalating dye, SYBR Green I. However, as we developed this haplotyping
technology, we found that the SBE reaction is not efficient on polony
amplification products greater than 500 base pairs in length, limiting our ability to
use small polonies for genotyping or haplotyping reactions. We hypothesized
that this phenomenon is due to secondary structure in the single-stranded DNA
that is stabilized by virtue of its location inside the pore of an acrylamide gel.
Therefore, we reasoned that if we amplified in acrylamide gels with the cleavable
cross-linker DATD and then loosened the gel by periodate cleavage of the
cross-linker, we could improve the efficiency of the SBE reaction in the
acrylamide gel. To test this idea, we amplified two 917 base pair templates in an
acrylamide gel with (10:1) DATD:bis cross-linker using a slightly modified
amplification protocol (see Methods). We cleaved the DATD cross-linker and
performed an SBE reaction with Cy5 (red) labeled dATP and then with Cy3
(green) labeled dGTP (Figure 4a). The Cy5/Cy3 ratios of 100 polonies are plotted
in Figure 4b. The polonies show a clear bimodal distribution, with the difference in
Cy5/Cy3 ratios between the two groups measured to be 600 +/- xx. The polonies
in Figure 5b are 50 µm in radius; however, we have previously shown that polonies
as small as 6 µm can be amplified using a DNA template of approximately the
same length {Why weren’t these?}. These results suggest that it should be
possible to perform {How many?} many polony haplotyping reactions on a single
slide, or in modified 384 well plates[Bell, 2002 #39].
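Because the two groups of polonies separate cleanly by Cy5/Cy3 ratio, a genotype call can in principle be made by thresholding the log ratio. The sketch below illustrates that idea only: the function name, the threshold of 0, and the intensity values are assumptions, and it is not the polony_call.m script.

```python
import math

def call_by_log_ratio(cy5, cy3, threshold=0.0):
    """Call each polony 'A' (Cy5-dATP extended) or 'G' (Cy3-dGTP extended).

    cy5, cy3: background-subtracted, strictly positive intensities, one
    value per polony. The threshold on log10(Cy5/Cy3) is an assumed value.
    """
    calls = []
    for red, green in zip(cy5, cy3):
        log_ratio = math.log10(red / green)
        calls.append("A" if log_ratio > threshold else "G")
    return calls

# Invented intensities for two polonies, one dominated by each dye.
calls = call_by_log_ratio([1200.0, 40.0], [30.0, 900.0])
```

In practice the threshold would be chosen from the gap between the two modes of the observed ratio distribution rather than fixed in advance.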
Polony Genotyping. In addition to haplotyping, the results presented above
demonstrate the ability to genotype a large number of single DNA molecules.
This ability can be used to accurately determine the ratio of two alleles of a gene,
which is important for detecting allelic imbalance, loss of heterozygosity, and
somatic mutations. We tested the ability to measure small changes in the ratio
of two alleles by mixing genomic DNA from two inbred strains of mouse in
various ratios and performing polony amplification followed by SBE. Polonies
were counted automatically by in-house software, and the results are shown in
Figure 1-5. Significantly, the observed ratios (blue) match the expected ratios
(pink), with errors of less than +/-7.5% (95% confidence level, error bars
representing 3 standard deviations).
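Because each polony is counted as one allele or the other, the precision of an allele-ratio measurement can be approximated from the counts alone. The sketch below estimates the allele fraction and its binomial standard error. The counts are invented, and treating counting noise as purely binomial is an assumption, not a description of the in-house software.

```python
import math

def allele_fraction(count_a, count_b):
    """Return (fraction of allele A, binomial standard error)."""
    n = count_a + count_b
    p = count_a / n
    # Standard error of a proportion under binomial sampling.
    se = math.sqrt(p * (1 - p) / n)
    return p, se

# Invented counts: 300 polonies called allele A, 700 called allele B.
frac, se = allele_fraction(300, 700)
```

The standard error shrinks with the square root of the number of polonies counted, so counting four times as many polonies roughly halves the uncertainty.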
Discussion
The polony haplotyping reactions performed in this study used two pairs of
primers to amplify two different polony products from the same molecule. This
strategy was employed due to the large distance (8-45kb) between the pairs of
SNPs haplotyped. For SNPs that are 1-4 kilobases apart, another feasible
strategy is to use one pair of primers to amplify both loci in one single polony and
then genotype both SNPs. In the strategy we employed, two polonies must be
amplified from a single DNA molecule. However, not every template molecule
included in the reaction must give rise to two polonies to correctly determine the
haplotype. In our experiments, the polony efficiency, defined as the probability
that a DNA molecule will give rise to a polony, ranged from 25% to 80% (check
these figures). Because a large number {How large} of polonies were analyzed,
accurate haplotypes were obtained. Increasing the polony efficiency would
decrease the number of polonies that need to be analyzed. These observations
raise two questions: i) what are the parameters that affect the polony efficiency,
and ii) what is the relationship between the accuracy of polony haplotyping, the
polony efficiency, the number of polonies analyzed, and the plating density of the
polonies?
We have found that the presence of ungelled acrylamide in the polony gel
during thermal cycling decreases the polony efficiency. We hypothesize that
acrylamide monomer reacts with DNA at high temperature to form an adduct so
that PCR primers that flank this site will not produce a polony. This hypothesis is
supported by the observation that ungelled acrylamide monomer strongly inhibits
in-tube PCR, even when 1 ng of DNA is included as template (data not shown).
But, if the acrylamide is first gelled, the reaction proceeds normally. For this
reason, we degassed all reagents used in the polony amplification and we
polymerized the gels under argon. In an alternate protocol, we poured the
acrylamide gels on the slide, washed away any ungelled acrylamide, diffused in
the templates and PCR reagents and then performed the polony amplification
(see methods).
We have also found that polony efficiency depends on the design of the
primer pair used to amplify the locus. We used the Primer3[Rozen, 1998 #41]
program to design our primers, optimizing the selection parameters to maximize
polony efficiency. These parameters are listed in the Methods
section.
DNA fragmentation or degradation can also result in low measurements of
polony efficiency. If a number of DNA molecules in the polony amplification do
not contain both loci, then there will be fewer overlapping polonies, and therefore
the polony efficiency measurement will be lower than expected. This may
explain why the DNA samples from buccal swabs (Table 2), which were purified
using a fast but relatively crude protocol, displayed a lower fraction of
overlapping polonies than the other DNA samples, which were purified by more
standard methods (see Methods).
{This paragraph will discuss the relationship between the polony
efficiency, number of polonies amplified, and polony density}
For some applications it is desirable to phase a large number (100-10,000) of SNPs along a chromosome. We plan to multiplex the polony
amplification so that a large number of SNPs can be haplotyped in a single
reaction, and then perform many SBE reactions in order to obtain the entire
haplotype. PCR reactions can be multiplexed M=30-fold or more on a routine
basis[Kokoris, 2000 #40]. Polony amplification should be even more amenable
to multiplexing because mispriming events between any two of the attached
primers or among early amplification products are rare. Multiplexing would not
require any increase in the average polony efficiency because phasing is
transitive. If SNPs A, B, and C are heterozygous and we know the phase of SNP A
and SNP B and the phase of SNP B and SNP C, then we know the phase of SNP
A and SNP C. This means that, in the haplotyping reaction, it is not necessary
that any one DNA molecule amplifies M overlapping polonies, only that enough
pairs of overlapping polonies are amplified to infer the total haplotype for all
SNPs. Next, we would phase another set of M SNPs, some of which are located
further along the chromosome. This set of SNPs would also include
SNPs phased in the previous haplotyping reaction so that the relative phase of
the two sets of SNPs could be determined. (To achieve phasing between the
two sets, at least one SNP included in both reactions must be heterozygous in
the patient sample being tested.) Using polonies of 50 micron radius, as shown in
figure x, we should be able to fit 100 reactions on a glass microscope slide.
Alternatively, modified 384 well plates as described could be used[Bell, 2002 #39].
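The transitivity argument above can be made concrete with a small sketch that chains pairwise phase observations into a full haplotype. It assumes every SNP is heterozygous and every pairwise call is correct. The data structures, the function name, and the toy alleles are illustrative assumptions, not part of the described protocol.

```python
from collections import deque

def chain_phase(het_alleles, cis_pairs, start_snp):
    """Infer one haplotype by chaining pairwise cis observations.

    het_alleles: dict mapping SNP name -> (allele1, allele2).
    cis_pairs: list of ((snp1, allele1), (snp2, allele2)) seen in cis.
    Returns a dict SNP -> allele on the chromosome that carries the first
    listed allele of start_snp.
    """
    def other(snp, allele):
        a, b = het_alleles[snp]
        return b if allele == a else a

    # Index each observation under both SNPs so links can be walked either way.
    links = {}
    for (s1, a1), (s2, a2) in cis_pairs:
        links.setdefault(s1, []).append((a1, s2, a2))
        links.setdefault(s2, []).append((a2, s1, a1))

    haplotype = {start_snp: het_alleles[start_snp][0]}
    queue = deque([start_snp])
    while queue:
        snp = queue.popleft()
        for allele_here, nbr, allele_nbr in links.get(snp, []):
            if nbr in haplotype:
                continue
            # If the observed pair carries the allele on this haplotype, the
            # neighbor's observed allele is on it too; otherwise its partner is.
            if allele_here == haplotype[snp]:
                haplotype[nbr] = allele_nbr
            else:
                haplotype[nbr] = other(nbr, allele_nbr)
            queue.append(nbr)
    return haplotype

# Toy example: snpA-snpB and snpB-snpC were phased; snpA-snpC was never observed.
het = {"snpA": ("T", "C"), "snpB": ("G", "A"), "snpC": ("C", "T")}
observed = [(("snpA", "T"), ("snpB", "G")), (("snpB", "A"), ("snpC", "T"))]
hap = chain_phase(het, observed, "snpA")
```

Here the phase of snpA and snpC is recovered even though no overlapping polony pair linked them directly, which is the property that lets a multiplexed reaction tolerate incomplete polony efficiency.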
Polony technology provides an inexpensive, flexible method for
determining haplotypes. To our knowledge, no molecular haplotyping technology
has previously demonstrated the phasing of two SNPs more than 20 kilobases
apart [McDonald, 2002 #5] without separating the chromosomes by cloning or
somatic cell fusion. Here we phase two SNPs 45 kilobases apart, and there are
no apparent barriers to working with larger distances. Polony haplotyping also
requires very small amounts of patient DNA, simplifying its collection. Polony
haplotyping should be a valuable tool in understanding genetic variation.
The polony genotyping and haplotyping techniques described here are
inherently digital: each polony provides 1 bit of information. The power of
digital genotyping for molecular biology has been demonstrated by a related
technology, Digital PCR, which has found applications in detecting loss of
heterozygosity, measuring allelic skewing, and detecting rare mutations. The
polony genotyping presented here should further extend the utility of digital
genotyping, since millions[Mitra] of polonies can be counted on a single slide,
resulting in extremely accurate molecular quantitation and sensitivity.
Use fewer digits in Table 2 p-values.
Figure 2 notes: setting brightness 60, contrast 84 No zoom. Smoothed image
(use smooth_tif.m)
Figure 5. Panels A, B, and C. Plot title: Allelic Discrimination; y-axis: Log Ratio; x-axis: Rank.
Figure 6. y-axis: Polony Count Ratio; x-axis: Template Ratio.
Table 1
Patient  DK438 polonies  DK445-2 polonies  Overlapping polonies  CT calls  CC calls  Called haplotype  P-value
14611    94              92                24                    24        0         CT                5.96E-08
11321    65              77                22                    22        0         CT                2.38E-07
Table 2
Patient  Intron 13  Intron 9  Overlapping polonies  CC calls  CT calls  Called haplotype  P-value
NR       325        328       49                    46        3         CC                3.49054E-11
RM       295        334       35                    32        3         CC                2.08849E-07
DJ       232        234       38                    38        0         CC                3.63798E-12
VB       286        290       25                    21        4         CC                0.00045526
BW       274        274       11                    10        1         CC                0.005859375