Cherie Holcomb

NGS Data Consortium October 8, 2012
HLA Genotyping Data Generated by 454 Sequencing
Cherie Holcomb, Ph.D.
Roche Molecular Systems
Medium, High, and Very High Resolution HLA Genotyping Systems
Targeted Amplicon sequencing
– Primers target up to 9 loci (format as fusion primers or “4 primer
system” w/use of Fluidigm Access Array™)
• NOTE: After amplification of gDNA, amplicons contain all adapters
& MIDs etc. for NGS sequencing
– Workflow: Amplicons processed either individually or pooled (fusion
primers) or pooled (“4 primer system”)
– 454 Life Sciences GS FLX or GS Junior for NGS
– Conexio Assign ATF 454 software (commercially available; performs MR and HR
genotyping). For VHR, limited early access to Conexio Assign MPS v1.0
Amplicon Sequencing using 454 GS FLX*
Columns: Resolution; Possible/Future Applications; # Primer Pairs; Loci; Method of Amplicon Generation; # Samples per GS FLX run achieved; Advantages; Disadvantages
• "Medium" (MR): Unrelated bone marrow donor registry screening; 8 primer pairs; loci A, B, C, DQB1, DRB1 (also get DRB3/4/5)
– 454 fusion primer plates: 88 samples (limited by MID/PTP region combination if use commercial product). Advantages: commercially available (simplification of workflow by pooling amplicons possible). Disadvantages: storing primer plates at 4°C; limited to 11 MIDs, which limits # of samples per run
– Fluidigm Access Array™∆: 192 samples. Advantages: more MIDs→higher throughput; less sample consumption & PCR reagent consumption. Disadvantages: need to purchase Fluidigm instrumentation & disposables
• "High" (HR): Clinical; 14 primer pairs; loci as above & DQA1, DPB1
– 454 fusion primer plates (workflow as above): 88 samples (limitation as above)
– Fluidigm Access Array™∆: 96 samples
• "Very High" (VHR): Research; 22 primer pairs; loci as above & DPA1
– 454 fusion primer plates (workflow as above)ǂ: 40-48 samples (conservative est). Advantages: MR & HR plates commercially available (VHR primers could be added on by lab). Disadvantages: storing plates at 4°C; need to store VHR primers in diff format (liquid) unless made commercially available
– Fluidigm Access Array™ǂ: Advantages: less sample consumption & PCR reagent consumption. Disadvantages: as above
*GS Junior can also be used, ~8x fewer samples per run
∆Abstract #1025-LB
ǂAbstract #ORO1-02
NOTE: All current products are for Research Use Only
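The figure "88 (limited by MID/PTP region combination)" can be reproduced with simple arithmetic. The sketch below takes the 11 MIDs and the "~8x fewer" GS Junior factor from the slide, and assumes a PTP divided into 8 regions, which the slide does not state.

```python
# Samples per run = MIDs available per region x number of PTP regions.
MIDS_AVAILABLE = 11       # from the slide: "Limited to 11 MIDs"
PTP_REGIONS = 8           # ASSUMPTION: 8-region PTP, not stated on the slide
GS_JUNIOR_FACTOR = 8      # from the slide: "~8x fewer samples per run"


def samples_per_run(mids: int, regions: int) -> int:
    """Each PTP region can hold one sample per MID."""
    return mids * regions


flx_samples = samples_per_run(MIDS_AVAILABLE, PTP_REGIONS)
junior_samples = flx_samples // GS_JUNIOR_FACTOR
print(flx_samples, junior_samples)
```

Under these assumptions the MID count, not the instrument, is the limiting factor, which is consistent with "More MIDs→higher throughput" for the Fluidigm Access Array™ route.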
Primer Set Comparison
GS GType MR, HR with added VHR* amplicons
454 GS GType HLA Primer Sets

Locus                  MR Primers   HR Primers   VHR
CLASS I
• HLA-A:               2, 3         2, 3, 4      1, 2, 3, 4-5
• HLA-B:               2, 3         2, 3, 4      1, 2, 3, 4, 5
• HLA-C:               2, 3         2, 3, 4      1, 2, 3, 4, 5, 6-7
CLASS II
• DPA1 exon:           N/A          N/A          2
• DPB1 exon:           N/A          2            2
• DQA1 exon:           N/A          2            2
• DQB1 exons:          2            2, 3         2, 3
• DRB1 exon:           2            2            2, 3
• DRB 3, 4, 5 exon:    2            2            2, 3

*Numbers shown are exons, “-” indicates intron
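As a cross-check, the amplicon counts implied by the comparison can be tallied against the 8, 14, and 22 primer pairs quoted for MR, HR, and VHR. This is a minimal sketch: the dictionary is transcribed from the slide, and it assumes the DRB 3, 4, 5 row shares the DRB1 primer pairs (the slide says "also get DRB3/4/5") and so adds no primer pairs of its own.

```python
# Amplicons per locus for each primer set, transcribed from the comparison.
# ASSUMPTION: DRB3/4/5 are amplified by the DRB1 primer pairs and are therefore
# not counted separately.
PRIMER_SETS = {
    "MR": {
        "A": ["2", "3"], "B": ["2", "3"], "C": ["2", "3"],
        "DQB1": ["2"], "DRB1": ["2"],
    },
    "HR": {
        "A": ["2", "3", "4"], "B": ["2", "3", "4"], "C": ["2", "3", "4"],
        "DPB1": ["2"], "DQA1": ["2"], "DQB1": ["2", "3"], "DRB1": ["2"],
    },
    "VHR": {
        "A": ["1", "2", "3", "4-5"],
        "B": ["1", "2", "3", "4", "5"],
        "C": ["1", "2", "3", "4", "5", "6-7"],
        "DPA1": ["2"], "DPB1": ["2"], "DQA1": ["2"],
        "DQB1": ["2", "3"], "DRB1": ["2", "3"],
    },
}


def primer_pair_count(primer_set: dict) -> int:
    """One primer pair per listed amplicon."""
    return sum(len(amplicons) for amplicons in primer_set.values())


counts = {name: primer_pair_count(loci) for name, loci in PRIMER_SETS.items()}
print(counts)
```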
Primer Plate Layout for Very High Resolution Sequencing
9 loci, 22 primer pairs, 11 MIDs, 10 samples per set (3 plates)
Each plate row (A-H) holds one primer pair, repeated across columns 1-10; column 11 holds Neg (-) for that primer pair; column 12 is shown without entries.

Row   MR Plate    HR Plate    VHR Plate
A     A2          A4-5        A1
B     A3          B4          B1
C     B2          C4          C1
D     B3          DPB1        B5
E     C2          DQA1        C5
F     C3          DQB1 E3     C6-7
G     DQB1 E2     (empty)     DPA1
H     DRB1 E2     (empty)     DRB E3

(MR Plate row H is labeled DRB E2 in column 1 and DRB1 E2 in columns 2-10.)
MR and HR plates: commercially avail from 454
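The plate layout can be expressed as a well map, which makes it easy to confirm that the three plates together carry one negative control per primer pair. This is an illustrative sketch only: the row assignments are read from the slide, the treatment of columns 1-10 as sample wells is an assumption based on "10 samples per set", and `well_map` is a hypothetical helper, not part of any 454 or Conexio software.

```python
from typing import Dict, List, Tuple

ROWS = "ABCDEFGH"
SAMPLE_COLUMNS = range(1, 11)   # ASSUMPTION: columns 1-10 are sample wells
NEGATIVE_COLUMN = 11            # column 11 is labeled Neg (-) on the slide

# Primer pair per plate row, top to bottom, as read from the slide.
PLATES: Dict[str, List[str]] = {
    "MR": ["A2", "A3", "B2", "B3", "C2", "C3", "DQB1 E2", "DRB1 E2"],
    "HR": ["A4-5", "B4", "C4", "DPB1", "DQA1", "DQB1 E3"],
    "VHR": ["A1", "B1", "C1", "B5", "C5", "C6-7", "DPA1", "DRB E3"],
}


def well_map(primer_pairs: List[str]) -> Dict[str, Tuple[str, str]]:
    """Map well IDs such as 'A1' to (primer pair, role)."""
    wells: Dict[str, Tuple[str, str]] = {}
    for row, primer in zip(ROWS, primer_pairs):
        for col in SAMPLE_COLUMNS:
            wells[f"{row}{col}"] = (primer, "sample")
        wells[f"{row}{NEGATIVE_COLUMN}"] = (primer, "negative control")
    return wells


layouts = {name: well_map(pairs) for name, pairs in PLATES.items()}
negatives = sum(
    1
    for wells in layouts.values()
    for _, role in wells.values()
    if role == "negative control"
)
print(negatives)
```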
454 Amplicon Sequencing File Generation
454 Pyrosequencing
→ Image Acquisition: PNG
→ Image Processing: Image Processed CWF
→ Signal Processing: Signal Processed CWF; SFF; FNA (FASTA)
→ Consolidation (454 AVA software): Consolidated FNA (FASTA)
→ Examine seq
→ Genotyping (Conexio Assign): Genotype Report + Sequence Export
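The FNA files in this workflow are FASTA text, so they can be inspected outside the vendor software. The following is a minimal sketch of a FASTA reader using only the Python standard library; the read names and sequences in the example are made up for illustration and do not come from a real run.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {record id: sequence}; the id is the first word after '>'."""
    records: Dict[str, str] = {}
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Hypothetical two-read example; real FNA headers carry more metadata.
example = """>read_0001 length=8
ACGTACGT
>read_0002
GGCC
TTAA
""".splitlines()

reads = parse_fasta(example)
print(reads)
```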
Conexio Assign ATF 454 Interface
Genotypes automatically assigned, sequences visible
Conexio Assign ATF 454 Interface
Genotype Report allele name format and output format can be chosen
Conexio Assign MPS v1.0 Genotyping Report
All fields, MS Excel format
A CWD filter OR “highlighting” of CWD alleles in report (preferred) has been requested
454 GS GType HLA +VHR primers—only part of report shown; assay includes DQA1, DQB1,
DRB1 and DRB3/4/5
GS GType HLA HR primers, Conexio Assign MPSv1.0
References for gDNA; Noncoding sequence can be considered
HLA-A genotype: Ambiguity String includes A*03:01:01:02N
NC seq not activated
Null
GS GType HLA VHR primers, Conexio MPSv1.0
Noncoding sequence is activated, ambiguity string greatly reduced
HLA-A genotype: A*02:01:01:01/02L, A*03:01:01:01
NC seq activated; Null resolved
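The null-allele example above turns on the expression suffix in the allele name (N for null, L for low expression). A minimal sketch of splitting a fully written allele name into locus, fields, and suffix is shown below; the regular expression and the `parse_allele` helper are illustrative assumptions, not the Conexio implementation, and shorthand such as "01/02L" is not expanded.

```python
import re
from typing import List, NamedTuple

# Locus, up to four colon-separated fields, optional expression suffix.
ALLELE_RE = re.compile(
    r"^(?:HLA-)?([A-Z]+[0-9]*)\*(\d+(?::\d+){0,3})([NLSCAQ]?)$"
)


class Allele(NamedTuple):
    locus: str
    fields: List[str]
    suffix: str

    @property
    def is_null(self) -> bool:
        """True when the name carries the N (null) expression suffix."""
        return self.suffix == "N"


def parse_allele(name: str) -> Allele:
    match = ALLELE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognised allele name: {name!r}")
    locus, fields, suffix = match.groups()
    return Allele(locus, fields.split(":"), suffix)


null_allele = parse_allele("A*03:01:01:02N")
low_allele = parse_allele("A*02:01:01:02L")
print(null_allele, null_allele.is_null, low_allele.is_null)
```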
CHALLENGE: How could/should these sequences be reported?
Reporting of Sequence Information
Currently
• Can report out (combined) consensus exon sequence that has
given rise to list of possible genotypes for a given sample/locus.
Can do this easily for all samples. (Q: If community decides
sequence is necessary for publications, is this sufficient?)
– Cannot report component (consensus) sequences (with
exons matched) to give individual allele(s)
– Doesn’t include intronic sequence (but can report
consensus of each intron individually—too laborious to be
practical)
– FASTA format in notepad (Q: Sufficient for publication?)
Reporting of Sequence Information
Preferred
• Option to report component sequences (with exons and
introns matched) to give allele calls; important for reporting new
alleles
• For (combined or individual alleles) consensus sequence
– Option to report Coding only OR Coding plus Noncoding
– Format options including XML (accepted by IMGT); important for
reporting new alleles
Works in Progress
GS GType HLA VHR primers, Conexio MPSv1.0
New allele can be identified
Genotype has 1 mismatch w/IMGT database; can determine in which
allele
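Locating a single mismatch against a reference amounts to a position-by-position comparison of two aligned sequences. The sketch below assumes the consensus and reference are already aligned and of equal length; the sequences are toy values, not IMGT data, and `find_mismatches` is a hypothetical helper.

```python
from typing import List, Tuple


def find_mismatches(consensus: str, reference: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, observed base) per mismatch."""
    if len(consensus) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, ref_base, obs_base)
        for position, (obs_base, ref_base) in enumerate(
            zip(consensus, reference), start=1
        )
        if obs_base != ref_base
    ]


# Toy example: one substitution (A instead of G) at position 5.
reference_seq = "ACGTGCA"
consensus_seq = "ACGTACA"
mismatches = find_mismatches(consensus_seq, reference_seq)
print(mismatches)
```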
GS GType HLA VHR primers, Conexio MPSv1.0
DRB1*12 allele is a perfect match with IMGT database
GS GType HLA VHR primers, Conexio MPSv1.0
New allele is identifiable
DRB1*07:01 allele has 1 mismatch w/IMGT database (A at b259
instead of G); confirmed by Sanger sequencing
Proposed acceptance criteria:
• Sequenced multiple times (2 different runs)
• Minimum read depth of 25 for each direction, each allele, for (all) amplicons of prospective new allele
• Mutation(s) defining new allele observed in both F and R direction
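The proposed acceptance criteria translate directly into a checklist. This is a minimal sketch under stated assumptions: the `AmpliconObservation` record and its field names are hypothetical, and the thresholds (2 runs, depth 25 per direction) are the ones proposed on the slide.

```python
from dataclasses import dataclass
from typing import List

MIN_RUNS = 2      # "Sequenced multiple times (2 different runs)"
MIN_DEPTH = 25    # "Minimum read depth of 25 for each direction"


@dataclass
class AmpliconObservation:
    """Read depth for one allele of one amplicon in one run (hypothetical)."""
    run_id: str
    amplicon: str
    allele: str
    forward_depth: int
    reverse_depth: int


def meets_criteria(
    observations: List[AmpliconObservation],
    mutation_in_forward: bool,
    mutation_in_reverse: bool,
) -> bool:
    """Apply the three proposed criteria to a prospective new allele."""
    enough_runs = len({obs.run_id for obs in observations}) >= MIN_RUNS
    deep_enough = all(
        obs.forward_depth >= MIN_DEPTH and obs.reverse_depth >= MIN_DEPTH
        for obs in observations
    )
    both_directions = mutation_in_forward and mutation_in_reverse
    return enough_runs and deep_enough and both_directions


# Made-up depths for illustration only.
good = [
    AmpliconObservation("run1", "DRB1 E2", "new", 40, 31),
    AmpliconObservation("run2", "DRB1 E2", "new", 55, 27),
]
shallow = [
    AmpliconObservation("run1", "DRB1 E2", "new", 40, 31),
    AmpliconObservation("run2", "DRB1 E2", "new", 55, 12),
]
print(meets_criteria(good, True, True), meets_criteria(shallow, True, True))
```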
Reporting Sequences for New Allele
Using info from Res Layers, manually harvest sequences and
assemble 1 allele
“Copy Sequence” Output
Simple Text file: Copy into Word Pad, Excel, Bioedit
Not in FASTA or XML format (currently no way to convert to latter)
Assume XML is most appropriate for submission to IMGT database
In discussion with Conexio
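Until an export option exists, text from “Copy Sequence” could be wrapped into FASTA with a few lines of script. The sketch below assumes the copied text contains only sequence characters plus whitespace; the record name and the 60-column line width are arbitrary choices, and conversion to XML is not attempted because the required schema is not described here.

```python
def to_fasta(name: str, copied_text: str, width: int = 60) -> str:
    """Strip whitespace from copied sequence text and wrap it as one FASTA record."""
    sequence = "".join(copied_text.split()).upper()
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


# Hypothetical record name and toy sequence.
record = to_fasta("sample01_DRB1_allele1", "acgt\nac gt", width=4)
print(record)
```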
Gaps in IMGT database create ambiguity in typing
Gaps indicated by “orange bar” in user interface but not in
Genotyping Report
etc.
Issue: Would be good if alleles lacking sequence were flagged in Genotyping Report
In discussion with Conexio
Additional info & Summary
• Using HR or VHR 454 sequencing HLA genotyping system including
Conexio Assign ATF 454 or MPS v1.0 software, respectively:
– Ambiguity string lengths are reduced to a practically reportable size
– Genotype/ allele ambiguity strings (in various formats using
combinations of delineation in columns, “+”, “or”, “,”) can be reported
in Excel, text and XML(??) format at 1, 2, or all field level.
– NMDP codes supported
– Most recent IMGT nomenclature and references supported (updated
periodically, every 6 months); version of references used is reported
– Export of consensus sequence used to make genotype calls for all
loci/all samples is easily accomplished in FASTA format—currently
doesn’t include NC sequence.
– New alleles readily identifiable, however, reporting of amplicon
sequences currently only possible by manual “harvesting” into text
file.
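Reporting an ambiguity string at the 1-field or 2-field level, as mentioned above, can be illustrated by truncating each allele name and removing duplicates. This is a sketch under a simplifying assumption: the expression suffix (N, L, etc.) is dropped whenever a name is shortened, which is not necessarily how Conexio Assign handles it.

```python
from typing import List

EXPRESSION_SUFFIXES = "NLSCAQ"


def truncate(allele: str, fields: int) -> str:
    """Shorten an allele name to the given number of fields."""
    locus, _, rest = allele.partition("*")
    parts = rest.rstrip(EXPRESSION_SUFFIXES).split(":")
    if fields >= len(parts):
        return allele  # nothing to cut, keep the full name and suffix
    # ASSUMPTION: the suffix is dropped when the name is shortened.
    return locus + "*" + ":".join(parts[:fields])


def reduce_ambiguity(alleles: List[str], fields: int) -> List[str]:
    """Truncate every allele and keep the first occurrence of each result."""
    return list(dict.fromkeys(truncate(a, fields) for a in alleles))


ambiguity = ["A*03:01:01:01", "A*03:01:01:02N"]
two_field = reduce_ambiguity(ambiguity, 2)
all_field = reduce_ambiguity(ambiguity, 4)
print("/".join(two_field), "/".join(all_field))
```

Note that the 2-field result no longer distinguishes the null allele, which is one reason the field level chosen for a report matters.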
Acknowledgements
• Roche Molecular Systems
– Henry Erlich
• Conexio Genomics
– Damian Goodridge
Back-up slides
Ambiguity C*03:03/03:20N
GS GType HLA HR primers
Ambiguity Resolution C*03:03/03:20N
VHR primers