International Biometric Society
PENALIZED REGRESSION MODELS FOR MULTIPLEX SNPS PYROSEQUENCING
DATA ANALYSIS
Ambroise Jérôme1, Butoescu Valentina2, Tombal Bertrand2, Robert Annie3, Gala Jean-Luc1
1 Centre for Applied Molecular Technologies (CTMA), Institut de Recherche Expérimentale et
Clinique (IREC), Université catholique de Louvain, Brussels, Belgium. 2 Service d’Urologie,
Institut de Recherche Expérimentale et Clinique (IREC), Cliniques universitaires Saint-Luc,
Université catholique de Louvain, Brussels, Belgium. 3 Epidemiology and Biostatistics
Department (EPID), Institut de Recherche Expérimentale et Clinique (IREC), Université
catholique de Louvain, Clos Chapelle-aux-Champs 30, 1200 Bruxelles, Belgium
Pyrosequencing is a cost-effective DNA sequencing technology with many applications,
including rapid genotyping of Single Nucleotide Polymorphisms (SNPs) for bacterial or human
applications [1]. The chemiluminescent signal produced during the reaction is detected by
the pyrosequencer and displayed as a pyrosequencing signal (also known as a Pyrogram™),
which is then translated into the corresponding nucleotide sequence. An increasing number
of clinical applications rely on the computation of a multilocus genetic score and therefore
require the genotyping of multiple DNA stretches. In such applications, several
pyrosequencing primers can be used simultaneously in a multiplex experiment; the main
difficulty is that the primer-specific pyrosequencing signals overlap.
The novelty of this study lies in selecting the nucleotide dispensation order according to
the multiplex pyrosequencing application, and in analysing the signal with a new signal
processing method based on a sparse representation of the pyrosequencing signal [2]. An
over-complete dictionary of standardized simplex pyrosequencing signals is first
constructed. A penalized linear regression model is then built with the multiplex
pyrosequencing signal under test (y) as the response and all signals from the dictionary as
predictor variables. As a proof of concept, this new signal processing method was applied to
a series (n=8) of human DNA samples to genotype nine well-identified prostate cancer
risk-associated SNPs in two successive pyrosequencing experiments (one quintuplex and one
quadruplex). The rationale for this application is the recent demonstration that genotyping
this set of nine SNPs can improve patient selection for prostate biopsy when combined with a
prostate cancer risk calculator [3].
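The regression step described above can be illustrated with a minimal sketch. It is not
the authors' implementation: the penalty type (an L1 penalty with a non-negativity
constraint), the solver (proximal gradient descent), the penalty weight `lam`, and the toy
dictionary `D` with its four allele-specific columns over eight dispensations are all
assumptions chosen for illustration, and the toy signal is noise-free.

```python
import numpy as np

def nonneg_lasso(D, y, lam=0.01, n_iter=5000):
    """Minimise 0.5*||y - D b||^2 + lam*sum(b) subject to b >= 0
    by proximal gradient descent (illustrative solver, an assumption)."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz constant of the gradient
    b = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ b - y)
        # Gradient step, then proximal step for the penalty and the constraint
        b = np.maximum(b - step * (grad + lam), 0.0)
    return b

# Hypothetical dictionary: rows are nucleotide dispensations, columns are
# simplex signals (two alleles for each of two SNPs). Values are made up.
labels = ["SNP1_allele1", "SNP1_allele2", "SNP2_allele1", "SNP2_allele2"]
D = np.array([
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
], dtype=float)

# Toy multiplex signal: homozygous for SNP1 allele 1, heterozygous for SNP2.
true_coef = np.array([1.0, 0.0, 0.5, 0.5])
y = D @ true_coef

coef = nonneg_lasso(D, y)
for name, value in zip(labels, coef):
    print(f"{name}: {value:.3f}")
```

In this toy setting, the dictionary signals that receive clearly non-zero coefficients
indicate which alleles contribute to the overlapping multiplex signal; the coefficients are
slightly shrunk by the penalty.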
High-quality results were obtained with both multiplex pyrosequencing experiments, and
perfect concordance was observed between multiplex and simplex (gold-standard) results.
To the best of our knowledge, this is the first time that quadruplex and quintuplex
pyrosequencing signals have been generated from single wells with each SNP being correctly
identified and assigned. Multiplex pyrosequencing therefore makes it possible to shorten
the overall turnaround time of SNP genotyping and to substantially reduce analytical
reagent costs and technician workload while providing reliable results.
References
1. Ronaghi, M., Pyrosequencing sheds light on DNA sequencing. Genome Res,
2001. 11(1): p. 3-11.
2. Ambroise, J., et al., AdvISER-PYRO: Amplicon Identification using SparsE
Representation of PYROsequencing signal. Bioinformatics, 2013. 29(16): p.
1963-1969.
3. Butoescu, V., et al., Does genotyping of risk-associated single nucleotide
polymorphisms improve patient selection for prostate biopsy when combined
with a prostate cancer risk calculator? Prostate, 2013.
International Biometric Conference, Florence, ITALY, 6 – 11 July 2014