Welcome Slide

Welcome to UW-Madison, the
WNPRC, and O’Connor Lab!
MHC Genotyping Workshop
November 7th – 11th, 2011
Madison, Wisconsin
Introductions
• Trainers (WNPRC Genetics Service)
– Roger Wiseman
– Julie Karl
– Simon Lank
– Gabe Starrett
– Francesca Norante
• Participants
– Wendy Garnica
– Mark Garthwaite
– Julie Holister-Smith
– Suzanne Queen
– Premeela Rajakumar
– Yuko Yuki
Schedule of Events
• Monday
– Welcome and Overview
Presentation
– Begin bench work: cDNA
synthesis & PCR (run #1)
• Tuesday
– PCR product purification,
quantification & pooling (run
#1)
– Begin emulsion PCR (run #1)
– Begin bench work (run #2)
• Wednesday
– Break & enrich DNA beads (run
#1)
– Run Roche/454 GS Junior
instrument (run #1)
– emPCR (run #2)
• Thursday
– View run #1 results
– Continue work on run #2
– Informatics presentation
– Data analysis
• Friday
– Run #2 results
– Continue Data Analysis &
Wrap-up
Overview of Presentation
• Our lab & research focus
• Evolution of DNA sequencing technology
• Discussion of Roche/454 technology & sample
multiplexing
• MHC genotyping method overview
– NHP immunogenetics
– Genotyping strategy
– Workflow
• Genotyping results
Welcome to Madison!
WNPRC
The Wisconsin National Primate
Research Center (WNPRC)
• Only federally funded National Primate
Research Center in the Midwest
• Center holds ~1,100 rhesus macaques, 200
marmosets, and 100 cynomolgus macaques
• Research strengths:
– Immunogenetics & Virology
– Aging & Metabolism
– Reproductive & Regenerative Medicine
The O’Connor Laboratory
Genetics Services Members
The O’Connor Laboratory: Research
• NHP immunogenetics (MHC class I, class II, KIR)
– Cynomolgus Macaque (Mauritian, Indonesian, SE
Asian)
– Rhesus Macaque (Indian & Chinese)
– Japanese Macaque, Vervet, Sooty Mangabey
• SIV pathogenesis (immunology) and viral
evolution
• Human immunogenetics (HLA) and HIV variation
Sequencing Technology is Changing
• Micro sequencing reactions
– Pyrosequencing
– Single molecule sequencing
• Higher throughput
– Millions of sequences per day
• Lower cost
– $10,000 human genome
(original HGP = $3 billion)
Sequencing Technology: Overview
• 1st Generation (previous): Sanger sequencing
Applied Biosystems 3730xl: 1 x 10^3 reads / day
- 500 to 1,000 bp read length
Sequencing Technology: Overview
• 2nd Generation (current): 454, Illumina, SOLiD,
Ion Torrent
Roche / 454: 1 x 10^6 reads / day
- 500 to 800 bp read length
Illumina: 2 x 10^9 reads / week
- 100 or 200 bp read length
Sequencing Technology: Overview
• 3rd Generation (future): Pacific Biosciences,
Nanopore sequencing, Complete Genomics
Pacific Biosciences: 1 x 10^5 sequences / hour
- 1,000 to 10,000 bp reads (?)
- Single molecule sequencing
- Goal = $1,000 genome !
Sequencing Technology: Overview
• 1st Generation (previous): Sanger
– Slow, Expensive, Not clonal, easy to analyze
• 2nd Generation (current): 454, Illumina, SOLiD,
Ion Torrent
– Faster, Cheaper, Clonal, hard to analyze
• 3rd Generation (future): Pacific Biosciences,
Nanopore sequencing, Complete Genomics,
Helicos
– Very fast, Very cheap, Impossible to analyze
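The throughput figures quoted on the overview slides can be put on a common scale of bases per day. The sketch below does only that arithmetic; taking the midpoint of each read-length range and dividing the weekly Illumina figure by seven are assumptions made for illustration, not values from the slides.

```python
# Rough bases-per-day comparison from the reads/day and read-length
# figures quoted on the slides. Midpoint read lengths are an assumption.
platforms = {
    "Sanger (AB 3730xl)": {"reads_per_day": 1e3, "read_len_bp": (500, 1000)},
    "Roche/454": {"reads_per_day": 1e6, "read_len_bp": (500, 800)},
    # Slide quotes 2 x 10^9 reads per week; spread evenly over 7 days here.
    "Illumina": {"reads_per_day": 2e9 / 7, "read_len_bp": (100, 200)},
}


def bases_per_day(reads_per_day, read_len_bp):
    """Reads per day times the midpoint of the read-length range."""
    low, high = read_len_bp
    return reads_per_day * (low + high) / 2


for name, spec in platforms.items():
    print(f"{name}: ~{bases_per_day(**spec):.2e} bases/day")
```

On these assumptions the gap between Sanger and Roche/454 is roughly three orders of magnitude in raw output.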
Roche / 454 Sequencing
How does it work?
Flowgram (instead of chromat)
O’Connor Laboratory Sequencing
Timeline, 2005 – 2010 (platforms in order of adoption):
• Sanger sequencing: NHP MHC class I genotyping with E. coli based cloning and Sanger sequencing; throughput of ~8 animals per week
• Pilot with Roche sequencing center: MHC class I genotyping pilot project; ~24 samples per week
• GS FLX at UIUC: MHC class I genotyping; ~48 samples per week
• Titanium pilot with Roche sequencing center: MHC class I full-length sequencing project with Roche using Titanium chemistry
• GS Junior in lab: MHC class I and viral sequencing projects run in-house (> 48 samples per week)
Roche/454 Sequencing Advantages
• Inherently clonal (no bacterial cloning needed)
• Far cheaper per base than Sanger (3 – 4 orders
of magnitude)
• Reliable read number and data regularity
• Easy protocol: many people trained
GS Junior 5 Month Run Summary
MHC Class I 568bp Amplicon – 9 runs
– Average: 70,848 HQ reads; 523 bp median length
– Highest: 101,711 HQ reads; 526 bp median length
– Lowest: 33,552 HQ reads; 521 bp median length
SIV Whole Genome – 16 runs
– Average: 101,846 HQ reads; 360 bp median length
– Highest: 177,642 HQ reads; 494 bp median length
– Lowest: 42,949 HQ reads; 147 bp median length
SIV Epitope Amplicons (Various Sizes) – 5 runs
– Average: 80,244 HQ reads; 369 bp median length
– Highest: 107,605 HQ reads; 388 bp median length
– Lowest: 37,066 HQ reads; 356 bp median length
Ease of Use
• Access to instrument since Jan 2010
• 34 different fully-trained operators to date
• 7 additional people have begun training, but
have not yet completed a solo run
Ultra-Deep vs. Ultra-Wide Sequencing
• 2nd & 3rd Generation = thousands / millions of
sequences per run
• Cost per run is high ($1000s)
• Can examine polymorphic target at high depth
(ultra-deep)
– expensive
• Can sequence many samples at the
same time (ultra-wide)
– cheap
Ultra-Deep vs. Ultra-Wide Sequencing
• Significantly improves sensitivity over traditional
Sanger-based sequencing (500x vs 2x coverage)
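The deep-versus-wide trade-off comes down to how a run's reads are divided among samples. As a minimal sketch, the first function below splits a run evenly across samples, and the second estimates the chance of seeing a variant at least once at a given depth. The binomial model, which ignores sequencing error and uneven pooling, is a simplifying assumption, as is the 1% variant frequency.

```python
def reads_per_sample(reads_per_run, n_samples):
    """Average depth per sample if reads split evenly (an idealization)."""
    return reads_per_run / n_samples


def prob_detect(variant_freq, depth):
    """P(variant seen in at least one read), simple binomial model.

    Ignores sequencing error, so it is an upper bound on real sensitivity.
    """
    return 1.0 - (1.0 - variant_freq) ** depth


# Average MHC class I run from the 5 month summary, split over 96 samples.
print(reads_per_sample(70_848, 96))

# A hypothetical 1% variant at the two coverage levels quoted on the slide.
for depth in (2, 500):
    print(depth, round(prob_detect(0.01, depth), 4))
```

Under these assumptions a 1% variant is almost always sampled at 500x but is seen only about 2% of the time at 2x.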
Ultra-Deep vs. Ultra-Wide Sequencing
Ultra-deep
Ultra-wide
• HLA Typing
• Allele frequencies
• SNP detection
• Low frequency ARV resistance
• TCR sequencing
• Antibody sequencing
Multiplexed (Ultra-wide) Amplicon
Sequencing
Multiplex
Identifier
MID Tag
Methods to increase multiplexing
1. Physically subdividing plate (gasket)
2. Sample specific MID sequence tags
3. Uniquely mixing 5’ & 3’ MID tags
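1. [Figure: plate physically subdivided with a gasket]

2. Sample-specific MID sequence tags

Patient   MID
1         ATCGTAGTCA
2         TCCGATCGA
3         GTGTAACGT
4         CCATGGATC
5         TGGATGCAG
6         TAGTAGCCA
7         GTAGTCTAA
8         AACGATGCA
9         GCGCTAGCA

3. Uniquely mixed 5' & 3' MID tags

Patient   5' MID   3' MID
1         1        1
2         1        2
3         1        3
4         2        1
5         2        2
6         2        3
7         3        1
8         3        2
9         3        3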
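The 5'/3' MID combination scheme can be sketched in a few lines. Three tags give nine unique pairs, and a read is assigned to a patient by looking up the tag at each end. This is an illustration only: the three tag sequences are borrowed from the slide's MID list as examples, and the code assumes exact tag matches and that both tags read in the same orientation, which real demultiplexing software does not assume.

```python
from itertools import product

# Three example tags (sequences taken from the slide's MID list; the
# numbering 1-3 is assigned here for illustration).
MIDS = {1: "TCCGATCGA", 2: "GTGTAACGT", 3: "CCATGGATC"}
TAG_LEN = 9

# (5' MID, 3' MID) -> patient, in the order shown on the slide:
# (1,1)=1, (1,2)=2, (1,3)=3, (2,1)=4, ... (3,3)=9
PAIR_TO_PATIENT = {
    pair: patient
    for patient, pair in enumerate(product(MIDS, repeat=2), start=1)
}
SEQ_TO_MID = {seq: mid for mid, seq in MIDS.items()}


def demultiplex(read):
    """Return (patient, insert) or (None, read) if either tag is unknown."""
    five = SEQ_TO_MID.get(read[:TAG_LEN])
    three = SEQ_TO_MID.get(read[-TAG_LEN:])
    if five is None or three is None:
        return None, read
    return PAIR_TO_PATIENT[(five, three)], read[TAG_LEN:-TAG_LEN]


read = MIDS[2] + "ACGTACGT" + MIDS[1]
print(demultiplex(read))  # 5' MID 2 + 3' MID 1 is patient 4 on the slide
```

Combining this with a physically subdivided plate multiplies the sample count again, since the same tag pairs can be reused in each region.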
O’Connor lab sequencing projects
• NHP comprehensive MHC genotyping & allele
discovery (amplicons)
Importance of MHC Class I
Host Immune Genetics
Source: modified from Yewdell et al., Nature
Reviews Immunology 2003
• MHC class I molecules dictate immunity to disease
• High degree of polymorphism within the MHC class I peptide-binding domain
• Specific MHC alleles associated with superior control of HIV infection
NHP MHC Class I Allele Libraries
[Figure: bar chart of total # alleles in GenBank by species, y-axis 0 to 700]
• Rhesus Macaque: 663
• Cynomolgus Macaque: 460
• Pig-tailed Macaque: 156
• Vervet: 9
• Sooty Mangabey: 0
• For comparison: Human HLA class I = 5,400 alleles
Human HLA vs NHP MHC Class I
[Figure: class I loci on each of the two haplotypes]
• Human HLA class I: A, C, B
• Nonhuman primate MHC class I: A1 A2 A3 A4 and B1 B2 B3 B4 ... BN
MHC Genotyping Design
[Figure: % MHC class I variability (y-axis, 0 to 100) by amino acid position (x-axis, 1 to 361) across the leader peptide, α1, α2, and α3 domains, and the transmembrane and cytoplasmic regions, with the 568bp amplicon and its forward (F) and reverse (R) primers marked]
• 568bp amplicon captures highly variable
peptide binding region flanked by conserved
sequences
• Amplifies in multiple primate species
• Longer reads provide better resolution of
alleles
MHC Genotyping Design
Primer = Adapter (A or B) + MID + sequence-specific
568bp Amplicon
Within a single nonhuman primate sample:
Within an MHC class I amplicon genotyping pool:
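The primer layout above (adapter, then MID, then the sequence-specific portion) can be expressed as a simple concatenation. In the sketch below the adapter and sequence-specific strings are placeholders invented for illustration, not the real Roche/454 adapters or the lab's MHC primers. Only the MID is taken from the slide's tag list.

```python
# Placeholder sequences: NOT real adapter or MHC primer sequences.
ADAPTER_A = "AAAAAAAAAA"            # hypothetical stand-in for Adapter A
ADAPTER_B = "CCCCCCCCCC"            # hypothetical stand-in for Adapter B
FORWARD_SPECIFIC = "GGGGTTTTGGGG"   # hypothetical sequence-specific 3' end
REVERSE_SPECIFIC = "TTTTGGGGTTTT"   # hypothetical sequence-specific 3' end


def fusion_primer(adapter, mid, sequence_specific):
    """Assemble 5'-adapter + MID + sequence-specific-3' as one oligo."""
    for part in (adapter, mid, sequence_specific):
        if not part or set(part) - set("ACGT"):
            raise ValueError(f"not a plain DNA sequence: {part!r}")
    return adapter + mid + sequence_specific


mid = "TCCGATCGA"  # one of the MID tags listed on the multiplexing slide
forward = fusion_primer(ADAPTER_A, mid, FORWARD_SPECIFIC)
reverse = fusion_primer(ADAPTER_B, mid, REVERSE_SPECIFIC)
print(forward)
print(reverse)
```

Because the MID sits between the adapter and the sequence-specific portion, every read from a sample starts with that sample's tag, which is what allows amplicons from many animals to be pooled before emPCR.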
Roche/454 MHC Workflow
• Total RNA isolation and cDNA synthesis
– RNA isolation ~4 hrs; cDNA synthesis ~2 hrs
• Primary PCR amplification
– plus SPRI purification, quantification, pooling ~3 hrs
• emPCR
– set-up ~1 hr, run ~5.5 hrs
• Breaking and enrichment
– ~3 hrs
• GS Junior run
– set-up ~1.5 hrs; run time ~10 hrs
• Data processing and analysis
– run processing ~2 hrs;
– analysis time varies
Source: www.454.com
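Adding up the approximate step times listed above gives a feel for the total turnaround from RNA to processed reads. The sketch assumes the ~3 hrs quoted for primary PCR covers amplification together with purification, quantification, and pooling, and it leaves out analysis time, which the slide gives as variable.

```python
# Approximate durations (hours) as listed on the workflow slide.
WORKFLOW_HOURS = [
    ("RNA isolation", 4.0),
    ("cDNA synthesis", 2.0),
    ("Primary PCR, SPRI purification, quantification, pooling", 3.0),
    ("emPCR set-up", 1.0),
    ("emPCR run", 5.5),
    ("Breaking and enrichment", 3.0),
    ("GS Junior set-up", 1.5),
    ("GS Junior run", 10.0),
    ("Run processing", 2.0),
]

elapsed = 0.0
for step, hours in WORKFLOW_HOURS:
    elapsed += hours
    print(f"{step:<58} {hours:>5.1f} h  (cumulative {elapsed:.1f} h)")

total_hours = sum(hours for _, hours in WORKFLOW_HOURS)
print(f"Total before analysis: ~{total_hours:.1f} h")
```

The two instrument runs (emPCR and the GS Junior run) account for roughly half of the total.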
GS Junior Run Metrics – MHC
Reads per Sample

Sample      MID   Read Count
Monkey001   1     525
Monkey002   2     392
Monkey003   3     1,023
Monkey004   4     504
Monkey005   5     450
Monkey006   6     722
Monkey007   7     622
Monkey008   8     489
Monkey009   9     344
Monkey010   10    635
Monkey011   11    660
Monkey012   12    796
Monkey013   13    653
Monkey014   14    731
Monkey015   15    1,342
Monkey016   16    628
Monkey017   17    76
Monkey018   18    481
Monkey019   19    503
Monkey020   20    633
Monkey021   21    573
Monkey022   22    463
Monkey023   23    390
Monkey024   24    723
Monkey025   25    739
Monkey026   26    560
Monkey027   27    1,672
Monkey028   28    559
Monkey029   29    801
Monkey030   30    590
Monkey031   31    548
Monkey032   32    748
Monkey033   33    583
Monkey034   34    374
Monkey035   35    226
Monkey036   36    791
Monkey037   37    618
Monkey038   38    558
Monkey039   39    438
Monkey040   40    666
Monkey041   41    250
Monkey042   42    451
Monkey043   43    612
Monkey044   44    673
Monkey045   45    570
Monkey046   46    207
Monkey047   47    604
Monkey048   48    180
Monkey049   49    585
Monkey050   50    504
Monkey051   51    673
Monkey052   52    565
Monkey053   53    893
Monkey054   54    581
Monkey055   55    623
Monkey056   56    955
Monkey057   57    698
Monkey058   58    792
Monkey059   59    655
Monkey060   60    1,203
Monkey061   61    428
Monkey062   62    8
Monkey063   63    391
Monkey064   64    663
Monkey065   65    411
Monkey066   66    386
Monkey067   67    625
Monkey068   68    637
Monkey069   69    367
Monkey070   70    391
Monkey071   71    585
Monkey072   72    808
Monkey073   73    594
Monkey074   74    391
Monkey075   75    578
Monkey076   76    728
Monkey077   77    612
Monkey078   78    283
Monkey079   79    475
Monkey080   80    527
Monkey081   81    27
Monkey082   82    226
Monkey083   83    113
Monkey084   84    481
Monkey085   85    52
Monkey086   86    612
Monkey087   87    733
Monkey088   88    800
Monkey089   89    647
Monkey090   90    1,094
Monkey091   91    522
Monkey092   92    756
Monkey093   93    624
Monkey094   94    912
Monkey095   95    610
Monkey096   96    514
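Per-sample read counts like these are commonly screened for samples that received too few reads to genotype reliably. The sketch below flags such samples using a handful of counts copied from the table; the cutoff of 100 reads is a hypothetical value chosen for illustration, not a threshold stated in the presentation.

```python
# A few read counts copied from the run metrics table.
read_counts = {
    "Monkey001": 525,
    "Monkey017": 76,
    "Monkey027": 1672,
    "Monkey062": 8,
    "Monkey081": 27,
    "Monkey085": 52,
}

MIN_READS = 100  # hypothetical cutoff, not taken from the slides


def low_coverage(counts, min_reads=MIN_READS):
    """Names of samples with fewer than min_reads reads, sorted."""
    return sorted(name for name, n in counts.items() if n < min_reads)


for name in low_coverage(read_counts):
    print(f"{name}: {read_counts[name]} reads, candidate for re-run")
```

Samples flagged this way could be re-amplified and re-pooled in a later run rather than genotyped from sparse data.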
Allele Calls & Transcript Profiles
[Figure: % total reads (y-axis, 0 to 16) per MHC class I allele for animals ChRh10, ChRh11, and ChRh12]
Lymphocyte Specific Expression
[Figure: % total reads (y-axis, 0 to 50) per MHC class I allele in CD4, CD8, CD14, CD16, and CD20 cell subsets]
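Both slides above report each allele as a percentage of a sample's total reads. A minimal sketch of that calculation follows. It assumes every read has already been assigned to an allele, and the allele labels are made-up placeholders rather than real MHC allele names.

```python
from collections import Counter


def transcript_profile(assigned_alleles):
    """Map each allele to its percentage of the sample's total reads."""
    counts = Counter(assigned_alleles)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {allele: 100.0 * n / total for allele, n in counts.items()}


# Toy input: one allele label per read (placeholder names, toy counts).
reads = ["allele_1"] * 6 + ["allele_2"] * 3 + ["allele_3"] * 1
for allele, pct in sorted(transcript_profile(reads).items()):
    print(f"{allele}: {pct:.1f}% of reads")
```

Normalizing to percentages makes profiles comparable between samples that received different numbers of reads.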
Same methods applicable to HLA
typing
• We have developed a similar assay to
genotype human samples: HLA Class I and
DRB loci
• Cheaper, higher-resolution, and higher-throughput than existing methods
• Can genotype up to 96 individuals per GS-Jr
run
High Resolution HLA Genotyping
[Figure: variability (y-axis, 0 to 0.7) by position (x-axis, 1 to 1079) across the LP, α1, α2, α3, TM, and CT regions, with two overlapping amplicons marked]
• Amplicon 1: 1kb-F / 581-R
• Amplicon 2: 581-F / 1kb-R bp SBT
High-resolution Typing for 40
Reference Cell Lines
UW ID#
HLA-Ref01
HLA-Ref02
HLA-Ref03
HLA-Ref04
HLA-Ref05
HLA-Ref06
HLA-Ref07
HLA-Ref08
HLA-Ref09
HLA-Ref10
HLA-Ref11
HLA-Ref12
HLA-Ref13
HLA-Ref14
HLA-Ref15
HLA-Ref16
HLA-Ref17
HLA-Ref18
HLA-Ref19
HLA-Ref20
HLA-Ref21
HLA-Ref22
HLA-Ref23
HLA-Ref24
HLA-Ref25
HLA-Ref26
HLA-Ref27
HLA-Ref28
HLA-Ref29
HLA-Ref30
HLA-Ref31
HLA-Ref32
HLA-Ref33
HLA-Ref34
HLA-Ref35
HLA-Ref36
HLA-Ref37
HLA-Ref38
A*
A*31:01:02
A*32:01:01
A*02:16
A*03:01:01:01/03
A*24:02:01:01/02L
A*26:02
A*30:01:01
A*02:01:01:01/02L/03
A*02:07
A*33:03:01
A*30:01:01
A*68:02:01:01/02/03
A*02:06:01
A*11:01:01
A*26:01:01
A*02:04
A*03:01:01:01/03
A*01:01:01:01
A*02:01:01:01/02L/03
A*02:01:01:01/02L/03
A*34:01:01
A*02:01:01:01/02L/03
A*01:01:01:01
A*25:01
A*30:02:01
A*01:01:01:01
A*02:05:01
A*01:01:01:01
A*03:01:01:01/03
A*01:01:01
A*02:01
A*01:01:01:01
A*24:02:01:01/02L
A*01:01:01:01
A*01:37
A*03:01:01:01/03
A*03:01:01:01/03
A*01:01:01:01
A*03:01:01:01/03
A*03:01:01:01/03
A*24:02:01:01/02L
A*02:01:01:01/02L/03
A*03:01:01:01/03
A*01:01:01:01
A*24:02:01:01/02L
A*24:02:01:01/02L
A*03:01:01:01/03
A*03:01:01:01/03
A*24:02:01:01/02L
A*02:01:01:01/02L/03
A*24:02:01:01/02L
A*24:02:01:01/02L
A*31:01:02
A*02:01:01:01/02L/03
A*24:02:01:01/02L
A*3402
A*7401
B*
B*51:01:01
B*38:01:01
B*51:01:01
B*40:06:01:01/02 B*51:01:01
B*13:02:01
C*
C*15:02:01
C*12:03:01:01/02
C*07:04:01
C*08:01:01
C*06:02:01:01/02
B*46:01:01
C*01:02:01
B*44:03:01
C*14:03
B*42:01:01
C*1701
B*15:01:01:01
B*35:01:01:01/02 C*03:03:01
B*08:01:01
C*07:01:01
B*51:01:01
C*15:02:01
B*47:01:01:01/02
C*06:02:01:01/02
B*57:01:01
C*06:02
C*15:02:01
C*14:02:01
C*04:01:01:01/02/03
B*35:03:01
C*12:03:01:01/02
B*35:01:01:01/02
B*15:21
B*15:35
C*04:01:01:01/02/03
C*04:03
C*07:02:01:01/02/03
B*15:01:01:01
B*49:01:01
B*51:01:01
B*18:01:01:01
B*08:01:01
B*07:02:01
B*05:801
B*39:06:02
B*35:01:01:01/02
B*07:02:01
B*07:02:01
B*35:01:01:01/02
B*35:01:01:01/02
B*50:01:01
B*58:01:01
B*07:02
B*58:01:01
B*58:01:01
B*35:01:01:01/02
B*35:01:01:01/02
B*58:01:01
B*51:01:04
C*03:04:01:01/02
C*07:01:01
C*01:02
C*05:01:01:01/02
C*06:02:01:01/02
C*07:01:01
C*07:01
C*07:01:01
C*07:01:01
C*07:02:01:01/02/03
C*07:02
C*07:02:01:01/02/03
C*04:01:01:01/02/03
C*04:01:01:01/02/03
C*04:01:01:01/02/03
C*04:01:01:01/02/03
C*07:02:01:01/02/03
C*07:02:01:01/02/03
C*07:18 (701?)
C*07:04:01
B*07:02:01
B*39:06:02
B*07:02:01
B*07:02:01
B*35:01:01:01/02
B*37:01:01
B*58:01:01
B*51:01:01
B*35:01:01:01/02
B*39:06:02
C*06:02:01:01/02
C*07:01:01
C*07:117
C*04:01:01:01/02/03
C*04:01:01:01/02/03
C*07:02:01:01/02/03
C*07:02:01:01/02/03
B*07:02:01
B*07:02:01
B*13:02:01
B*40:01:02
C*06:02:01:01/02
C*03:04:01:01/02
C*07:02:01:01/02/03
C*07:02:01:01/02/03
B*15:01:01:01
B*801
B*39:06:02
B*1503
C*03:03:01
C*02:10
C*07:02:01:01/02/03
C*701
C*07:02:01:01/02/03
C*07:02:01:01/02/03
Example High-Resolution HLA
Genotypes with DRB
Reads 1kbF 581F 581R 1kbR DRB-F DRB-R
122
35
41
23
23
150
50
45
50
5
74
16
24
25
9
223
36
87
61
39
99
14
52
13
20
45
2
32
2
9
163
83
80
127
65
62
60
60 .
Sample
HIV_114
HIV_114
HIV_114
HIV_114
HIV_114
HIV_114
HIV_114
HIV_114
HIV_114
Allele
A*36:01
A*68:01:01
B*41:02:01
B*53:01:01
C*04:01:01
C*17:01:01 (primer)
DRB1*01:02:01
DRB1*16:02:01
DRB5*02-novel?
HIV_115
HIV_115
HIV_115
HIV_115
HIV_115
HIV_115
HIV_115
HIV_115
HIV_115
HIV_115
A*03:01:01
A*11:01:01
B*07:02:01
B*51:01:01
C*07:02:01
C*15:02:01
DRB1*04:04:01
DRB1*07:01:01
DRB4*01:01:01:01
DRB4*01:03:01:01
60
70
120
177
62
109
165
228
93
99
24
32
28
53
30
60
HIV_116
HIV_116
HIV_116
HIV_116
HIV_116
HIV_116
HIV_116
HIV_116
HIV_116
HIV_116
A*01:01:01
A*02:01:01
B*08:01:01
B*15:01:01
C*03:04:01
C*07:01:01
DRB1*03:01:01
DRB1*04:01:01
DRB3*01:01:02
DRB4*01:03:01:01
122
97
213
129
103
114
471
429
137
176
37
40
57
21
27
46
16
16
48
53
15
20
31
17
71
58
43
22
7
9
12
35
16
19
49
31
63
32
21
41
13
13
32
36
1
10
86
114
75
75
79
114
18
24
244
221
74
101
227
208
63
75
5
9
22
18
12
5
Sample
HIV_117
HIV_117
HIV_117
HIV_117
HIV_117
HIV_117
HIV_117
HIV_117
HIV_117
HIV_117
Allele
A*26:01:01
A*29:02:01
B*44:03:01 (putative)
B*44:10 (putative)
C*04:01:01
Reads 1kbF 581F 581R 1kbR DRB-F DRB-R
167
24
74
40
29
96
24
31
24
17
286 112
53
59
62
210 113
51
46 .
245
38 130
26
51
DRB1*03:01:01
DRB1*07:01:01
DRB3*02:02:01
DRB4*01:03:01:01
173
171
50
44
HIV_118
HIV_118
HIV_118
HIV_118
HIV_118
HIV_118
HIV_118
HIV_118
HIV_118
A*02:01:01
A*23:01:01
B*40:01:02
B*44:03:01
C*03:04:01
C*14:03
DRB1*04:01:01
DRB1*10:01:01
DRB4*01:03:01:01
117
156
113
206
84
142
151
195
57
33
42
13
51
7
28
HIV_119
HIV_119
HIV_119
HIV_119
HIV_119
HIV_119
HIV_119
HIV_119
HIV_119
A*29:01:01:01
A*68:01:02
B*07:05:01
B*44:02:01:01
C*05:01:01
C*15:05:01/02
DRB1*04:04:01
DRB1*07:01:01
DRB4*01:03:01:01
36
73
48
86
47
63
233
250
77
13
36
12
41
25
26
46
61
50
81
47
61
7
12
11
15
5
15
24
39
35
63
15
31
10
20
7
26
10
11
94
81
25
29
79
90
25
15
80
96
33
71
99
24
89
105
33
144
145
44
14
14
15
11
15
22
6
5
18
4
7
11