file - BioMed Central

SUPPLEMENTARY MATERIAL
INDEX
Tables S1-S10
Figures S1-S3
Table S1. Target gene list.
ABO
ADAMTS13
ADRB1
ADRB2
ADRB3
ADRBK1
ANXA5
APCS
APOH
ARHGEF1
AXL
C1orf114
C1QB
C1R
C1S
CADM1
CALM1
CALM2
CALM3
CALR
CASP8AP2
CD34
COL1A1
COL1A2
COL2A1
COL3A1
COL4A1
COL4A2
COL4A3
COL4A3BP
COL4A4
COL4A5
COL4A6
COL6A1
COL6A2
COL6A3
CR2
CRP
CYP4V2
EDIL3
EPS8L2
F10
F11
F12
F13A1
F13B
F2
F2RL1
F2RL2
F2RL3
F3
F5
F7
F8
F8
F9
FCGR2A
FCGR2B
FGA
FGA
FGB
FGG
FTO
GAS6
GNAI3
GNAQ
GNAS
GNB2L1
GP1BA
GP1BB
GP5
GP6
GP9
HIF1A
HIF3A
HRH2
HRH2
HTR1A
HTR1B
HTR1D
HTR1E
HTR1F
HTR2A
HTR2B
HTR2C
HTR3A
HTR3B
HTR4
HTR6
ICAM1
ICAM2
ICAM3
ICAM4
ICAM5
IL17A
IL1A
IL1B
IL23A
IL6
ITFG2
ITGA2
ITGA2B
ITGB1
ITGB2
ITGB3
KNG1
LDLR
LPA
LRP1
MARCKS
MERTK
MET
MFGE8
MTHFR
MYBPC3
NAT8B
NFKB1
NFKB2
NOS3
ODZ1
OS9
P2RY12
PCSK9
PF4
PKN2
PKN3
PLA2G6
PLAT
PLAU
PLCB1
PLCB2
PLCB3
PLCB4
PLCG1
PLCG2
PLG
PPP1CA
PPP1CB
PPP2CA
PPP2CB
PPP3CA
PPP3CB
PRKCA
PRKCD
PRKCQ
PRKD1
PRKD2
PRKD3
PROC
PROS1
PROZ
PSEN1
PTGS1
PTGS2
PTK2B
PTX3
RASGRP2
RGS7
RND2
SCARB1
SELE
SELP
SERPINB2
SERPINC1
SERPIND1
SERPINE1
SERPINF2
TACR1
TBX2
TBXAS1
TFPI
THBD
TLN1
TLR1
TLR10
TLR2
TLR3
TLR4
TNF
TYK2
TYRO3
VASP
VHL
VHLL
VWF
ZNF544
Table S2. Characteristics of the individuals who underwent next-generation sequencing.
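| Individual ID | Age, years | Gender | Number of thrombotic episodes | Type of episodes | Pulmonary embolism at first DVT | BMI, kg/m2 | ATIII, % | PC, % | PS, % | PT, INR | aPTT, ratio | Fibrinogen, mg/dL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DVT_P_01 | 48 | F | 2 | DVT, SVT | No | 30.1 | 120 | 101 | 102 | 1.07 | 0.91 | 361 |
| DVT_P_02 | 20 | M | 2 | DVT (2) | No | 23.3 | 106 | 134 | 188 | 1.04 | 0.81 | 278 |
| DVT_P_03 | 48 | M | 1 | DVT | No | 24.6 | 103 | 116 | 105 | 0.94 | 1.02 | 388 |
| DVT_P_04 | 39 | M | 1 | DVT | Yes | 24.8 | 104 | 77 | 153 | 1 | 1.27 | 301 |
| DVT_P_05 | 35 | M | 1 | DVT | Yes | 24.7 | 111 | 88 | 101 | 1.01 | 0.92 | 279 |
| DVT_P_06 | 32 | F | 2 | DVT (2) | No | 19.8 | 97 | 86 | 101 | 1.05 | 0.93 | 324 |
| DVT_P_07 | 48 | M | 1 | DVT | No | 21.5 | 116 | 63 | 100 | 1.03 | 1.13 | 304 |
| DVT_P_08 | 23 | F | 1 | DVT | No | 30.1 | 95 | 81 | 135 | 1.06 | 1.16 | 374 |
| DVT_P_09 | 37 | M | 3 | DVT, SVT (2) | Yes | 34.6 | 87 | 78 | 166 | 1.02 | 0.94 | 365 |
| DVT_P_10 | 55 | F | 3 | DVT, SVT (2) | No | 22.9 | 98 | 88 | 146 | 1 | 0.9 | 265 |
| DVT_C_01 | 45 | F | / | / | / | 22.7 | 115 | 78 | 98 | 1.01 | 1.00 | 374 |
| DVT_C_02 | 25 | M | / | / | / | 24.4 | 96 | 81 | 116 | 1.15 | 0.97 | 229 |
| DVT_C_03 | 48 | M | / | / | / | 32.2 | 96 | 129 | 135 | 0.98 | 1.02 | 235 |
| DVT_C_04 | 39 | M | / | / | / | 25.3 | 101 | 72 | 100 | 1.10 | 1.14 | 206 |
| DVT_C_05 | 37 | M | / | / | / | 24.2 | 115 | 135 | 145 | 1.00 | 0.99 | 208 |
| DVT_C_06 | 34 | F | / | / | / | 29.6 | 109 | 141 | 121 | 1.03 | 0.96 | 366 |
| DVT_C_07 | 46 | M | / | / | / | 27.1 | 87 | 156 | 123 | 0.92 | 0.98 | 311 |
| DVT_C_08 | 25 | F | / | / | / | 18.7 | 111 | 118 | 97 | 0.97 | 0.97 | 244 |
| DVT_C_09 | 40 | M | / | / | / | 28.1 | 106 | 113 | 124 | 0.92 | 1.01 | 207 |
| DVT_C_10 | 56 | F | / | / | / | 21.5 | 103 | 89 | 99 | 1.02 | 0.98 | 285 |
| DVT_C_11 | 48 | F | / | / | / | 26.2 | 93 | 110 | 108 | 0.90 | 1.00 | 371 |
| DVT_C_12 | 50 | F | / | / | / | 22.0 | 102 | 100 | 95 | 0.97 | 1.06 | 272 |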
Type of episodes reports the types of thrombotic episode in the patient's history; the number of episodes of each type is given in parentheses. DVT indicates deep vein thrombosis; SVT, superficial vein thrombosis; BMI, body mass index; ATIII, antithrombin; PC, protein C; PS, protein S; PT, prothrombin time; INR, international normalized ratio; aPTT, activated partial thromboplastin time.
Table S3. Individual sequence and coverage statistics.
Statistics | DVT_P_01 | DVT_P_02 | DVT_P_03 | DVT_P_04 | DVT_P_05 | DVT_P_06 | DVT_P_07 | DVT_P_08 | DVT_P_09 | DVT_P_10 | DVT_C_01 | DVT_C_02 | DVT_C_03 | DVT_C_04 | DVT_C_05 | DVT_C_06 | DVT_C_07 | DVT_C_08 | DVT_C_09 | DVT_C_10 | DVT_C_11 | DVT_C_12
Raw Mb
532
592
525
422
538
513
194
471
147
367
203
544
177
472
328
488
561
471
210
548
183
Unique Mb
Unique %
On –target
Average
Coverage
1x Cov.
10x Cov.
20x Cov.
40x Cov.
325
361
326
266
305
342
293
463
176
644
222
472
176
592
186
527
173
748
230
501
175
519
197
61%
61%
62%
7%
6%
7%
63%
39%
37%
40%
41%
39%
39%
60%
63%
61%
62%
40%
36%
39%
33%
34%
32%
37%
39%
7%
7%
7%
7%
8%
6%
7%
7%
7%
7%
7%
7%
6%
7%
6%
6%
5%
7%
7%
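| Individual | Average Coverage | 1x Cov. | 10x Cov. | 20x Cov. | 40x Cov. |
|---|---|---|---|---|---|
| DVT_P_01 | 62 | 99% | 96% | 93% | 83% |
| DVT_P_02 | 62 | 99% | 96% | 93% | 84% |
| DVT_P_03 | 60 | 99% | 96% | 92% | 82% |
| DVT_P_04 | 53 | 98% | 95% | 91% | 76% |
| DVT_P_05 | 34 | 98% | 88% | 74% | 39% |
| DVT_P_06 | 33 | 98% | 88% | 75% | 39% |
| DVT_P_07 | 31 | 98% | 87% | 70% | 33% |
| DVT_P_08 | 28 | 98% | 87% | 68% | 27% |
| DVT_P_09 | 33 | 98% | 89% | 74% | 39% |
| DVT_P_10 | 32 | 98% | 88% | 73% | 35% |
| DVT_C_01 | 63 | 99% | 96% | 93% | 84% |
| DVT_C_02 | 56 | 99% | 95% | 91% | 79% |
| DVT_C_03 | 62 | 99% | 96% | 93% | 83% |
| DVT_C_04 | 58 | 99% | 95% | 92% | 80% |
| DVT_C_05 | 33 | 98% | 89% | 74% | 37% |
| DVT_C_06 | 38 | 98% | 91% | 80% | 48% |
| DVT_C_07 | 31 | 98% | 87% | 71% | 35% |
| DVT_C_08 | 33 | 98% | 89% | 74% | 37% |
| DVT_C_09 | 28 | 99% | 85% | 66% | 26% |
| DVT_C_10 | 33 | 98% | 89% | 75% | 38% |
| DVT_C_11 | 32 | 98% | 89% | 74% | 35% |
| DVT_C_12 | 34 | 98% | 90% | 76% | 40% |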
Table S4. General sequence and coverage statistics.
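| Statistics | Average | Min | Max |
|---|---|---|---|
| Raw Mb | 523 | 367 | 748 |
| Unique Mb | 236 | 147 | 361 |
| Unique % | 46% | 32% | 63% |
| On-target | 7% | 5% | 8% |
| Average Coverage | 42 | 28 | 63 |
| 1x Cov. | 98% | 98% | 99% |
| 10x Cov. | 91% | 85% | 96% |
| 20x Cov. | 80% | 66% | 93% |
| 40x Cov. | 53% | 26% | 84% |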
Table S5. Individual single nucleotide variant statistics.
| Type of variant | DVT_P_01 | DVT_P_02 | DVT_P_03 | DVT_P_04 | DVT_P_05 | DVT_P_06 | DVT_P_07 | DVT_P_08 | DVT_P_09 | DVT_P_10 | DVT_C_01 | DVT_C_02 | DVT_C_03 | DVT_C_04 | DVT_C_05 | DVT_C_06 | DVT_C_07 | DVT_C_08 | DVT_C_09 | DVT_C_10 | DVT_C_11 | DVT_C_12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| heterozygous | 367 | 324 | 356 | 326 | 292 | 315 | 254 | 275 | 277 | 264 | 366 | 339 | 359 | 344 | 283 | 273 | 260 | 262 | 267 | 311 | 324 | 272 |
| homozygous | 141 | 187 | 166 | 153 | 138 | 127 | 136 | 143 | 145 | 172 | 178 | 165 | 151 | 164 | 150 | 155 | 154 | 160 | 140 | 157 | 116 | 158 |
| Ratio Het/Hom | 2.60 | 1.73 | 2.14 | 2.13 | 2.12 | 2.48 | 1.87 | 1.92 | 1.91 | 1.53 | 2.06 | 2.05 | 2.38 | 2.10 | 1.89 | 1.76 | 1.69 | 1.64 | 1.91 | 1.98 | 2.79 | 1.72 |
| Non-coding | 258 | 246 | 238 | 233 | 166 | 174 | 164 | 175 | 170 | 188 | 254 | 235 | 248 | 232 | 184 | 194 | 172 | 186 | 178 | 197 | 183 | 178 |
| Coding | 250 | 265 | 284 | 246 | 264 | 268 | 226 | 243 | 252 | 248 | 290 | 269 | 262 | 276 | 249 | 234 | 242 | 236 | 229 | 271 | 257 | 252 |
| syn | 150 | 153 | 159 | 145 | 158 | 156 | 135 | 155 | 153 | 148 | 167 | 170 | 157 | 176 | 146 | 136 | 142 | 139 | 132 | 166 | 154 | 145 |
| nsyn | 100 | 112 | 125 | 101 | 106 | 112 | 91 | 88 | 99 | 100 | 123 | 99 | 105 | 100 | 103 | 98 | 100 | 97 | 97 | 105 | 103 | 107 |
| Ratio S/NS | 1.50 | 1.37 | 1.27 | 1.44 | 1.49 | 1.39 | 1.48 | 1.76 | 1.55 | 1.48 | 1.36 | 1.72 | 1.50 | 1.76 | 1.42 | 1.39 | 1.42 | 1.43 | 1.36 | 1.58 | 1.50 | 1.36 |
| nonsense | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 2 | 1 | 0 |
| dbSNP129 | 469 | 461 | 477 | 429 | 404 | 412 | 366 | 390 | 396 | 406 | 489 | 464 | 461 | 462 | 398 | 404 | 382 | 392 | 387 | 442 | 419 | 405 |
| not in dbSNP129 | 39 | 50 | 45 | 50 | 26 | 30 | 24 | 28 | 26 | 30 | 55 | 40 | 49 | 46 | 35 | 24 | 32 | 30 | 20 | 26 | 21 | 25 |
| %Novel | 8 | 10 | 9 | 10 | 6 | 7 | 6 | 7 | 6 | 7 | 10 | 8 | 10 | 9 | 8 | 6 | 8 | 7 | 5 | 6 | 5 | 6 |
| Ti/Tv | 2.60 | 2.28 | 2.81 | 2.57 | 2.71 | 2.71 | 2.82 | 2.73 | 3.06 | 2.60 | 2.32 | 2.68 | 2.64 | 2.74 | 3.01 | 2.72 | 2.66 | 2.94 | 3.42 | 3.03 | 2.96 | 2.55 |
| TOT SNVs | 508 | 511 | 522 | 479 | 430 | 442 | 390 | 418 | 422 | 436 | 544 | 504 | 510 | 508 | 433 | 428 | 414 | 422 | 407 | 468 | 440 | 430 |
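The derived rows of Table S5 can be recomputed from the counts in the same column. The following is a minimal sketch, assuming that Ratio Het/Hom and Ratio S/NS are simple quotients and that %Novel is the share of SNVs not in dbSNP129 among all SNVs (dbSNP129 plus not in dbSNP129). The function name `snv_ratios` is hypothetical and is not part of the original analysis; the input counts are those reported for DVT_P_01.

```python
def snv_ratios(het, hom, syn, nsyn, in_dbsnp, not_in_dbsnp):
    """Recompute the derived SNV statistics from raw counts (assumed definitions)."""
    total = in_dbsnp + not_in_dbsnp  # assumed to equal TOT SNVs
    return {
        "ratio_het_hom": round(het / hom, 2),
        "ratio_s_ns": round(syn / nsyn, 2),
        "pct_novel": round(100 * not_in_dbsnp / total),
        "total_snvs": total,
    }


# Counts reported for DVT_P_01 in Table S5.
dvt_p_01 = snv_ratios(het=367, hom=141, syn=150, nsyn=100,
                      in_dbsnp=469, not_in_dbsnp=39)
print(dvt_p_01)
```

Under these assumptions, heterozygous plus homozygous calls (367 + 141) also sum to the same total as dbSNP129 plus novel calls (469 + 39), which is a quick consistency check for any column of the table.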
Table S6. General single nucleotide variant statistics.
| Type of variant | Average | Range |
|---|---|---|
| heterozygous | 305 | 254-367 |
| homozygous | 153 | 116-187 |
| Ratio Het/Hom | 2 | 2-3 |
| Non-coding | 202 | 164-258 |
| Coding | 255 | 226-290 |
| syn | 152 | 132-176 |
| nsyn | 103 | 88-125 |
| Ratio S/NS | 1.48 | 1.27-1.76 |
| nonsense | 0 | 0-2 |
| dbSNP129 | 423 | 366-489 |
| not in dbSNP129 | 34 | 20-55 |
| %Novel | 7 | 5-10 |
| Ti/Tv | 2.75 | 2.28-3.42 |
| TOT SNVs | 458 | 390-544 |
Table S7. Indel statistics.
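| Type of variant | Average | Min | Max |
|---|---|---|---|
| insertions | 3 | 0 | 7 |
| deletions | 5 | 1 | 10 |
| homozygous | 1 | 0 | 3 |
| heterozygous | 7 | 2 | 13 |
| non-coding | 7 | 1 | 14 |
| coding | 1 | 0 | 2 |
| frameshift | 1 | 0 | 2 |
| in frame | 0.2 | 0 | 1 |
| total | 8 | 2 | 15 |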
Table S8. Variants present in human gene mutation database, HGMD®.
| Chromosome | Coordinate | Minor Allele | Major Allele | Gene | Functional annotation | dbSNP | Associated disease | Association with thrombotic disease | Association with DVT or DVT-associated phenotype |
|---|---|---|---|---|---|---|---|---|---|
| chr1 | 55301775 | G | A | PCSK9 | Missense | rs505151 | Atherosclerosis, severity, association with | Yes | No |
| chr1 | 167765599 | C | T | F5 | Missense | rs6030 | Thrombosis ? | Yes | Yes |
| chr1 | 167778379 | C | T | F5 | Missense | rs4524 | Thrombosis, increased risk, association with | Yes | Yes |
| chr1 | 167788473 | C | T | F5 | Missense | novel | Thrombosis ? | Yes | Yes |
| chr1 | 167831970 | A | C | SELP | Missense | rs6133 | Atopy, increased risk, association with | No | No |
| chr1 | 167832937 | C | T | SELP | Missense | novel | Higher platelet SELP measures, association with | No | No |
| chr1 | 195297644 | C | T | F13B | Missense | rs6003 | Myocardial infarction, risk, association with | Yes | No |
| chr1 | 205694316 | C | T | CR2 | Regulatory | rs3813946 | Increased transcriptional activity, association | No | No |
| chr4 | 155731347 | T | C | FGA | Regulatory | rs2070011 | Venous thromboembolism, suscep., association with | Yes | Yes |
| chr4 | 187357205 | C | A | CYP4V2 | Missense | rs13146272 | Deep vein thrombosis, reduced risk, association with | Yes | Yes |
| chr5 | 148186633 | A | G | ADRB2 | Missense | rs1042713 | Asthma, nocturnal, association with | No | No |
| chr5 | 148186666 | G | C | ADRB2 | Missense | rs1042714 | Obesity, association with | No | Yes |
| chr5 | 176769138 | A | G | F12 | Regulatory | rs1801020 | Premature myocardial infarction, association with | No | No |
| chr7 | 93881175 | C | G | COL1A2 | Missense | rs42524 | Intracranial aneurysm, suscept., assoc. with | No | No |
| chr7 | 150327044 | T | G | NOS3 | Missense | rs1799983 | Coronary spasm, association with | Yes | No |
Table S9. Nonsynonymous variants in coagulation genes. Annotations and allele counts in the next generation sequencing
experiments are reported.
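| Gene | Chromosome | Coordinate | Reference allele | Variant allele | Transcript ID | Protein change | dbSNP129 | 1000Genomes CEU population, AF | SIFT | Polyphen 2 | Alleles cases | Alleles controls |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FGA | chr4 | 155726496 | C | T | NM_000508 | p.R512K | novel | not present | Ben | Ben | 0 | 1 |
| FGA | chr4 | 155726824 | C | T | NM_000508 | p.A403T | novel | not present | Ben | Ben | 0 | 1 |
| FGA | chr4 | 155727010 | T | A | NM_000508 | p.T341S | novel | not present | Ben | Ben | 1 | 0 |
| FGA | chr4 | 155727040 | T | C | NM_000508 | p.T331A | rs6050 | 0.217 | Ben | Pod | 9 | 4 |
| FGB | chr4 | 155706593 | C | T | NM_005141 | p.P100S | rs2227434 | not present | Dam | Ben | 1 | 1 |
| FGB | chr4 | 155711209 | G | A | NM_005141 | p.R478K | rs4220 | 0.225 | Ben | Ben | 3 | 9 |
| F2 | chr11 | 46701579 | C | T | NM_000506 | p.T165M | rs5896 | 0.083 | Dam | Pod | 1 | 1 |
| F3 | chr1 | 94768650 | C | T | NM_001993 | p.G281E | rs3789683 | not present | Ben | Ben | 0 | 1 |
| F5 | chr1 | 167750185 | T | C | NM_000130 | p.D2222G | rs6027 | 0.033 | Dam | Prd | 2 | 4 |
| F5 | chr1 | 167751391 | A | G | NM_000130 | p.M2148T | rs9332701 | 0.017 | Dam | Prd | 0 | 1 |
| F5 | chr1 | 167765599 | T | C | NM_000130 | p.M1764V | rs6030 | 0.25 | Ben | Ben | 4 | 7 |
| F5 | chr1 | 167776742 | G | A | NM_000130 | p.P1404S | rs9332608 | 0.042 | Dam | Ben | 1 | 0 |
| F5 | chr1 | 167777353 | C | A | NM_000130 | p.S1200I | novel | not present | Dam | Pod | 1 | 0 |
| F5 | chr1 | 167778179 | T | C | NM_000130 | p.K925E | rs6032 | 0.225 | Ben | Ben | 2 | 4 |
| F5 | chr1 | 167778358 | T | C | NM_000130 | p.H865R | rs4525 | 0.225 | Ben | Ben | 3 | 4 |
| F5 | chr1 | 167778379 | T | C | NM_000130 | p.K858R | rs4524 | 0.225 | Ben | Ben | 3 | 3 |
| F5 | chr1 | 167778502 | T | G | NM_000130 | p.N817T | rs6018 | 0.025 | Dam | Ben | 2 | 6 |
| F5 | chr1 | 167785736 | C | T | NM_000130 | p.R513K | rs6020 | not present | Ben | Ben | 0 | 1 |
| F5 | chr1 | 167788477 | A | G | NM_000130 | p.M413T | rs6033 | 0.033 | Ben | Ben | 2 | 4 |
| F5 | chr1 | 167808137 | C | G | NM_000130 | p.D107H | rs6019 | 0.05 | Dam | Ben | 3 | 1 |
| F7 | chr13 | 112818013 | G | A | NM_000131 | p.G157S | novel | not present | Ben | Prd | 1 | 0 |
| F7 | chr13 | 112820770 | G | A | NM_000131 | p.R283Q | novel | not present | Ben | Pod | 0 | 1 |
| F7 | chr13 | 112821160 | G | A | NM_000131 | p.R413Q | rs6046 | 0.1 | Ben | Ben | 3 | 1 |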
Table S9. (continued)
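| Gene | Chromosome | Coordinate | Reference allele | Variant allele | Transcript ID | Protein change | dbSNP129 | 1000Genomes CEU population, AF | SIFT | Polyphen 2 | Alleles cases | Alleles controls |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| F8 | chrX | 153811479 | G | C | NM_000132 | p.D1260E | rs1800291 | not available | Ben | Ben | 0 | 2 |
| F9 | chrX | 138460946 | A | G | NM_000133 | p.T194A | rs6048 | not present | Ben | Ben | 1 | 5 |
| F12 | chr5 | 176763563 | G | C | NM_000505 | p.P385A | novel | not present | Dam | Prd | 0 | 1 |
| F12 | chr5 | 176763842 | G | A | NM_000505 | p.P327S | novel | not present | Ben | Ben | 1 | 0 |
| F12 | chr5 | 176764432 | C | G | NM_000505 | p.A207P | rs17876030 | 0.008 | Ben | Pod | 1 | 2 |
| F12 | chr5 | 176764772 | G | C | NM_000505 | p.L140V | rs35515200 | 0.017 | Ben | Pod | 1 | 0 |
| F13A | chr6 | 6097136 | C | G | NM_000129 | p.E652Q | rs5988 | 0.233 | Ben | Ben | 4 | 5 |
| F13A | chr6 | 6097139 | C | T | NM_000129 | p.V651I | rs5987 | 0.042 | Ben | Ben | 0 | 2 |
| F13A | chr6 | 6119865 | G | A | NM_000129 | p.P565L | rs5982 | 0.217 | Ben | Ben | 4 | 5 |
| F13A | chr6 | 6263794 | C | A | NM_000129 | p.V35L | rs5985 | 0.2 | Ben | Ben | 5 | 4 |
| F13B | chr1 | 195292774 | T | A | NM_001994 | p.E388V | rs5991 | not present | Ben | Pod | 1 | 0 |
| F13B | chr1 | 195292912 | A | G | NM_001994 | p.I342T | rs17514281 | 0.008 | Dam | Ben | 0 | 1 |
| F13B | chr1 | 195297644 | C | T | NM_001994 | p.R115H | rs6003 | 0.058 | Ben | Ben | 1 | 2 |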
The 1000Genomes CEU population field reports whether the variant is annotated in the 1000Genomes database; if the variant is present, its allele frequency (AF) in the CEU population is reported. In the SIFT and Polyphen 2 annotation results, Ben indicates predicted benign; Dam, potentially damaging according to SIFT; Pod, possibly damaging according to Polyphen 2; Prd, probably damaging according to Polyphen 2.
Table S10. Novel missense variants identified during replication.
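| Chromosome | Coordinate | Gene | Ref Base | Var Base | Functional effect |
|---|---|---|---|---|---|
| chr4 | 155726925 | FGA | C | T | p.S369N |
| chr4 | 155726929 | FGA | C | T | p.G368R |
| chr4 | 155726935 | FGA | C | T | p.E366K |
| chr4 | 155727054 | FGA | G | T | p.S326Y |
| chr4 | 155727127 | FGA | G | C | p.P302A |
| chr4 | 155727156 | FGA | G | A | p.A292V |
| chr16 | 80474389 | PLCG2 | C | T | p.P236L |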
Figure S1. Patient selection flowchart.
2139 referred for lower limb DVT
374 withdrawn consent or DNA not available
1765 available for inclusion
730 idiopathic DVT
1035 secondary DVT
Selection criteria (see main text)
11 (of 42 eligible) selected for next-generation sequencing
Figure S2. Nxtgen2plink.rb workflow summary.
A)
Nxtgen2plink.rb WORKFLOW
1) The program generates a variable_sites file with all sites at which a variant was detected in at least one individual in the entire cohort.
2) For each individual, 3 files are generated:
calls: good-quality variants in a modified .pileup format
filtered: variants eliminated after calling (StrandBias or AlleleBalance)
pileup: coverage for all the variable_sites at which a variant was not present in the individual
B)
SNP GENOTYPING
3) The program interrogates the individual files to generate the individual genotypes at all the variable_sites.
For each variable site:
Is the site in the calls file?
  Yes: There's a good variant! Generate Het or Hom variant genotype.
  No: Is the site in the filtered file?
    Yes: Genotype uncertain at the site! Generate missing genotype.
    No: Assess coverage at the site in the pileup file.
      ≥8X: No variant at the site! Generate wild-type genotype from the reference genome.
      <8X: Insufficient coverage! Generate missing genotype.
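The per-site decision in panel B can be sketched in a few lines. This is an illustrative Python sketch, not the code of Nxtgen2plink.rb: the function name `genotype_at_site`, the dictionary and set inputs, and the use of "00" for a missing genotype are assumptions made for the example. Only the order of the checks and the 8X coverage threshold come from the figure.

```python
MIN_COVERAGE = 8   # threshold shown in the figure (>=8X vs <8X)
MISSING = "00"     # assumed encoding of a missing genotype


def genotype_at_site(site, calls, filtered, pileup_depth, reference_base):
    """Return a two-character genotype for one individual at one variable site.

    calls          -- dict: site -> genotype of a good-quality variant call
    filtered       -- set of sites whose call was removed after calling
    pileup_depth   -- dict: site -> read depth where no variant was called
    reference_base -- reference allele at the site
    """
    if site in calls:
        return calls[site]                       # Het or Hom variant genotype
    if site in filtered:
        return MISSING                           # genotype uncertain
    if pileup_depth.get(site, 0) >= MIN_COVERAGE:
        return reference_base + reference_base   # wild-type from the reference
    return MISSING                               # insufficient coverage


calls = {"chr1:100": "AG"}
filtered = {"chr1:200"}
depth = {"chr1:300": 25, "chr1:400": 3}
for site in ["chr1:100", "chr1:200", "chr1:300", "chr1:400"]:
    print(site, genotype_at_site(site, calls, filtered, depth, "C"))
```

Treating a site that is absent from all three inputs as zero coverage is a further assumption; it keeps an unexpected gap from being reported as wild-type.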
Figure S2. (continued)
C)
OUTPUT
4) The phenotypic information (gender, case/control status) is included in the output:
PATIENT_ID   GENDER   CASE/CTRL_STATUS   SNP_1_chr1_12112837   SNP_2_chr3_1236167434   ...
DVT_P_9833   1        1                  AA                    GT                      …
DVT_NC_128   1        0                  00                    GG                      …
5) The output file is used to generate PLINK-compatible files. PLINK is then used for the association analyses:
- Calculate MAFs (in the cohort and in case/control groups)
- Genotype/phenotype association analysis (can be restricted to variants in a certain MAF range or with a certain genotype missingness)
- Calculate missing genotypes per individual, per variant, and overall
- Assess relatedness between individuals and population stratification
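One way the output table in panel C could be turned into PLINK text files is sketched below. This is a hedged illustration rather than the conversion Nxtgen2plink.rb actually performs. PLINK's PED format expects family ID, individual ID, paternal ID, maternal ID, sex and phenotype, followed by two alleles per variant; the MAP format expects chromosome, variant ID, genetic distance and base-pair position. The sketch assumes that CASE/CTRL_STATUS 1 and 0 map to PLINK's default phenotype codes 2 (case) and 1 (control), that GENDER already uses PLINK's sex coding, that "00" marks a missing genotype, and that the patient ID can double as the family ID. The function names are hypothetical.

```python
def row_to_ped_line(row, snp_columns):
    """Build one PED line from a row of the output table (dict of strings)."""
    # Assumed mapping: output status 1/0 -> PLINK phenotype 2 (case) / 1 (control).
    phenotype = "2" if row["CASE/CTRL_STATUS"] == "1" else "1"
    fields = [row["PATIENT_ID"], row["PATIENT_ID"], "0", "0",
              row["GENDER"], phenotype]
    for column in snp_columns:
        allele_1, allele_2 = row[column]   # "GT" -> "G", "T"; "00" -> "0", "0"
        fields.extend([allele_1, allele_2])
    return " ".join(fields)


def snp_to_map_line(column_name):
    """Build one MAP line from a column name such as 'SNP_1_chr1_12112837'."""
    _, _, chrom, position = column_name.split("_")
    if chrom.startswith("chr"):
        chrom = chrom[3:]
    return f"{chrom} {column_name} 0 {position}"


snps = ["SNP_1_chr1_12112837", "SNP_2_chr3_1236167434"]
case = {"PATIENT_ID": "DVT_P_9833", "GENDER": "1", "CASE/CTRL_STATUS": "1",
        "SNP_1_chr1_12112837": "AA", "SNP_2_chr3_1236167434": "GT"}
print(row_to_ped_line(case, snps))
print(snp_to_map_line(snps[0]))
```

Files written this way can be loaded with PLINK's `--ped` and `--map` options, or with `--file` when the two files share a prefix.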
Figure S3. Representative coverage histograms. Coverage (x-axis) is plotted against number of
reads (y-axis).
[Four histogram panels: DVT_P_01, DVT_P_02, DVT_P_03, DVT_P_04. x-axis ticks from 0 to 120; y-axis ticks from 0 to 14000.]