file - BioMed Central

SUPPLEMENTARY MATERIAL
Detecting evidence of positive selection across Asia
with sparse genotype data from the
HUGO Pan-Asian SNP Consortium
Xuanyao Liu, Woei-Yuh Saw, Mohammad Ali, Rick Twee-Hee Ong, Yik-Ying Teo
CONTENTS
1 Supplementary Methods
1.1 Quantifying over-representation of height genes
2 Supplementary figures
3 Supplementary tables
4 References
1 Supplementary Methods
1.1 Quantifying over-representation of height genes
Of the 59 genomic regions identified by haploPS as positively selected in the 31 PASNP population
groupings, 30 regions, containing a total of 3,518 genes, harboured at least one height-associated
gene. Because height has been the subject of many genome-wide association studies (GWAS), particularly
ones involving hundreds of thousands of samples, we evaluated whether height-related genes were
over-represented among the positively selected regions in the PASNP populations. As of 24
June 2013, 279 genes were reported to be associated with height in the NHGRI GWAS catalogue [1],
against a baseline of 28,906 genes on the autosomal chromosomes of the human genome. The 30 positively
selected regions contained 58 height-related genes, and a one-sided binomial test yielded a p-value of
9.98 × 10^-5, indicating that this enrichment is unlikely to be due to chance alone.
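The over-representation test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not spell out the exact parameterisation, so the sketch assumes the 3,518 genes in the 30 regions are the trials, the 58 height-related genes are the successes, and the null success rate is 279/28,906. The function name `binom_upper_tail` is hypothetical, and the result need not match the reported p-value exactly if the original test was set up differently.

```python
import math


def binom_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summing terms in log space."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        # log of the binomial coefficient C(n, i) via the log-gamma function
        log_choose = (math.lgamma(n + 1) - math.lgamma(i + 1)
                      - math.lgamma(n - i + 1))
        total += math.exp(log_choose + i * log_p + (n - i) * log_q)
    return min(total, 1.0)


# Counts quoted in the text; their roles in the test are assumptions.
HEIGHT_GENES = 279             # height-associated genes in the GWAS catalogue
AUTOSOMAL_GENES = 28_906       # baseline number of autosomal genes
GENES_IN_REGIONS = 3_518       # assumed number of trials
HEIGHT_GENES_IN_REGIONS = 58   # observed successes

null_rate = HEIGHT_GENES / AUTOSOMAL_GENES
expected = GENES_IN_REGIONS * null_rate
p_value = binom_upper_tail(HEIGHT_GENES_IN_REGIONS, GENES_IN_REGIONS, null_rate)
print(f"expected under null: {expected:.1f}; one-sided p = {p_value:.3g}")
```

Under these assumptions roughly 34 height-related genes would be expected by chance, against the 58 observed.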
2 Supplementary figures
Supplementary Figure 1. Evidence of positive selection by iHS in HGDP populations
Evidence of positive selection by iHS and XP-EHH for the populations in the Human Genome Diversity
Project at chromosome 2. The figures were obtained from the HGDP Selection Browser maintained by the
Pritchard Lab at http://hgdp.uchicago.edu/cgi-bin/gbrowse/HGDP/ using input coordinates of
Chr2:196,841,741..197,997,071.
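For programmatic use, a browser coordinate string of this kind can be split into chromosome, start and end. The following is a minimal sketch, assuming the `ChrN:start..end` form with comma thousands separators shown above; `parse_region` is a hypothetical helper, not part of the HGDP Selection Browser.

```python
import re


def parse_region(text):
    """Split 'ChrN:start..end' into (chromosome, start, end) with integer positions."""
    match = re.fullmatch(r"[Cc]hr(\w+):([\d,]+)\.\.([\d,]+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised region string: {text!r}")
    chrom = match.group(1)
    # Drop the thousands separators before converting to integers.
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start position exceeds end position")
    return chrom, start, end


chrom, start, end = parse_region("Chr2:196,841,741..197,997,071")
span_mb = (end - start) / 1e6
print(f"chromosome {chrom}: {start}-{end} ({span_mb:.2f} Mb)")
```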
Supplementary Figure 2. Selected haplotype forms in 12 PASNP population groupings
HaploPS identified extended haplotypes showing evidence of positive selection at chromosome 2
between 196.8Mb and 198.0Mb in 12 of the 31 PASNP population groupings. The selected haplotypes,
extracted at frequencies between 45% and 85% in the respective populations, were identical
and yielded a haplotype similarity index (HSI) of 1.00. We therefore infer that the selection
signals likely stem from the same evolutionary event, predating the divergence of these populations.
Supplementary Figure 3. HaploPS evidence around the HBB locus
The horizontal axis of each panel shows the genetic distance in centimorgans spanned by the longest
haplotype at 10% frequency across the genome, while the vertical axis shows the number of SNPs spanned
by the corresponding haplotype. Thailand indigenous 1 refers to the PASNP populations from Thailand of
H’Tin, Mlabri, Plang, Karen and Lawa ethnicities and the China Wa ethnic group; while Thailand
indigenous 2 refers to the PASNP populations from Thailand of Tai Lue, Tai Yong, Tai Kern and Tai Yuan
ethnicities.
3 Supplementary tables
Supplementary Table 1. Labeling and characteristics of the populations in PASNP. This table is
adapted from Figure 1 in the original PASNP publication [2].
Groupings (in row order, each spanning one or more consecutive rows): China Group 1, China Group 2,
China Group 3, China Group 4, China Han, Indonesia Group 1, Indonesia Group 2, Indonesia Group 3,
Indonesia Group 4, Indonesia Group 5, Indonesia Group 6, India Group 1, India Group 2, India Group 3,
India Group 4, Japan Main, Japan Okinawa, Korean, Malaysia Group 1, Malaysia Group 2, Malaysia Negrito,
Philippines Group 1, Philippines Negrito, Singapore Chinese, Taiwan Indigenous, Taiwan Main,
Thailand Group 1, Thailand Group 2, Thailand Group 3, Thailand Group 4, Thailand Group 5

Label   Country      Ethnicity     Language     # samples
CN-GA   China        Han           Cantonese    30
CN-HM   China        Hmong         Hmong        26
CN-JI   China        Jiamao        Jiamao       31
CN-CC   China        Zhuang        Zhuang       26
CN-UG   China        Uyghur        Uyghur       26
CN-SH   China        Han           Chinese      21
CHB     China        Han           Chinese      45
ID-SB   Indonesia    Kambera       Kambera      20
ID-RA   Indonesia    Manggarai     Manggarai    17
ID-SO   Indonesia    Manggarai     Manggarai    19
ID-LA   Indonesia    Lamaholot     Lamaholot    20
ID-LE   Indonesia    Lembata       Lembata      19
ID-AL   Indonesia    Alorese       Alor         19
AX-ME   Pacific      Melanesian    Nasioi       5
ID-TR   Indonesia    Toraja        Toraja       20
ID-MT   Indonesia    Mentawai      Mentawai     15
ID-ML   Indonesia    Malay         Malay        12
ID-KR   Indonesia    Batak Karo    Batak Karo   17
ID-TB   Indonesia    Batak         Batak Toba   20
ID-DY   Indonesia    Dayak         Benuak       12
ID-SU   Indonesia    Sundanese     Sunda        25
ID-JA   Indonesia    Javanese      Javanese     34
ID-JV   Indonesia    Javanese      Javanese     19
MY-BD   Malaysia     Bidayuh       Jagoi        50
IN-NI   India        Tharu         Pahari       20
IN-TB   India        Ladakhi       Spiti        23
IN-DR   India        Upper Caste   Telugu       24
SG-ID   Singapore    India Origin  Tamil        30
IN-WI   India        Bhil          Bhili        25
IN-EL   India        Upper Caste   Bengali      16
IN-SP   India        Upper Caste   Hindi        23
IN-WL   India        Upper Caste   Marathi      14
IN-IL   India        Upper Caste   Hindi        15
IN-NL   India        Upper Caste   Hindi        15
JP-ML   Japan        Japanese      Japanese     71
JPT     Japan        Japanese      Japanese     44
JP-RK   Japan        Ryukyuan      Okinawan     49
KR-KR   Korea        Korean        Korean       90
MY-KN   Malaysia     Malay         Malay        30
MY-MN   Malaysia     Malay         Minangkabau  20
SG-MY   Singapore    Malay         Malay        18
MY-TM   Malaysia     Proto-Malay   Temuan       49
MY-JH   Malaysia     Negrito       Jehai        50
MY-KS   Malaysia     Negrito       Kensiu       30
PI-MA   Philippines  Manobo        Manobo       18
PI-UI   Philippines  Urban         Visaya       20
PI-UN   Philippines  Urban         Tagalog      19
PI-UB   Philippines  Urban         Ilocano      20
PI-AT   Philippines  Negrito       Ati          23
PI-IR   Philippines  Negrito       Iraya        9
PI-MW   Philippines  Negrito       Mamanwa      19
PI-AG   Philippines  Negrito       Agta         8
PI-AE   Philippines  Negrito       Aeta         8
SG-CH   Singapore    Han           MinNan       30
AX-AM   Taiwan       Ami           Ami          10
AX-AT   Taiwan       Atayal        Atayal       10
TW-HA   Taiwan       Han           Hakka        32
TW-YA   Taiwan       Han           MinNan       48
TH-HM   Thailand     Hmong         Hmong        20
TH-YA   Thailand     Yao           Iu-Mien      19
TH-TY   Thailand     Tai Yong      Tai Yong     18
TH-TL   Thailand     Tai Lue       Lue          20
TH-TK   Thailand     Tai Kern      Tai Kern     18
TH-TU   Thailand     Tai Yuan      Tai Yuan     20
TH-TN   Thailand     H’Tin         Mal          18
TH-MA   Thailand     Mlabri        Mlabri       18
TH-PP   Thailand     Plang         Blang        18
TH-KA   Thailand     Karen         Karen        20
TH-LW   Thailand     Lawa          Lawa         19
CN-WA   China        Wa            Wa           56
TH-PL   Thailand     Palong        Palong       18
CN-JN   China        Jinuo         Jinuo        29
TH-MO   Thailand     Mon           Mon          19
Supplementary Table 2. Regions identified by haploPS to be positively selected in the 31 PASNP
population groupings. The start and end coordinates for each region are reported in NCBI Build 36
coordinates.
chr
1
startpos
31,628,061
endpos
59,351,170
pop_names
Thailand_Group5
freq_all
0.05
1
63,039,308
76,275,854
0.05
1
1
1
145,720,880 148,360,642
156,412,238 158,989,174
170,718,710 173,602,174
Malaysia_Negritos
Philippine_Negritos
Indonesia_Group1
Indonesia_Group2
Taiwan_Indigenous
Taiwan_Indigenous
1
2
181,292,574 197,417,076
8,624,853
9,653,517
Malaysia_Negritos
0.05
Japan_Okinawa Korean 0.9 0.8
2
16,676,509
18,526,577
0.25
2
42,409,140
48,888,233
Malaysia_Negritos
Malaysia_Group2
India_Group2
Japan_Okinawa
2
84,292,129
85,819,280
2
108,085,215 109,051,831
2
2
113,751,097 133,237,701
166,064,007 169,579,101
haploPS_start haploPS_end haploPS_pop
50,442,680
57,644,365
CHB, CHD, JPT, MAS
CHB, CHD, CHS, JPT,
64,112,389
76,257,415
MAS
0.85 0.75
0.65
145,830,625
0.2
NA
0.35
170,912,521
149,622,482
NA
171,363,586
CHB
NA
CHB, CHD, JPT
NA
9,270,479
NA
9,654,635
17,314,637
17,897,251
NA
CHB, CHD, CHS, JPT
CHB, CHD, CHS, JPT,
MAS
0.75 0.05
0.75
43,282,562
43,958,429
Singapore_Chinese
China_Group2
China_Han
0.7
84,222,108
84,995,378
0.85 0.9
108,286,633
108,910,484
0.05
0.1
118,670,171
NA
131,247,669
NA
0.45 0.6
0.55 0.85
0.75 0.55
0.5 0.65
0.45 0.8
0.5 0.65 196,982,600
197,749,763
top genes identified by Fst
ZMYM6, PIK3R3
DNAJC6,SLC44A5
NA
FCRL1,CD5L
FMO6P,FMO2
RGL1,EDEM3,HMCN1,DKFZp
762L185
NA
NA
CHD, CHS, JPT, MAS
CHB, CHD, CHS, JPT,
MAS
NA
CHB, CHD, CHS, JPT
CHB, CHD, CHS, JPT,
MAS
NA
SULT1C3
NA
2
196,841,741 197,997,071
China_Group2
Thailand_Group1
Indonesia_Group1
Malaysia_Group2
Malaysia_Group1
Indonesia_Group5
Indonesia_Group2
Thailand_Group3
India_Group2
Indonesia_Group3
Indonesia_Group6
Thailand_Group1
Philippine_Group1
Japan_Main
2
208,624,365 213,556,972
Thailand_Group4
0.1
208,754,345
213,267,474
3
39,148,534
55,438,612
0.05
44,121,527
49,710,750
3
56,851,823
73,194,663
0.05 0.05 58,038,371
73,054,439
3
3
103,332,218 123,321,009
126,537,059 134,498,594
Thailand_Group1
China_Group2
Thailand_Group1
Thailand_Group4
Japan_Okinawa
China_Group2
CEU India_Group2
Indonesia_Group6
China_Group1
Philippine_Group1
Thailand_Group3
Philippine_Negritos
Taiwan_Indigenous
Philippine_Negritos
Indonesia_Group5
0.05 0.05 103,385,279
0.1
NA
121,909,440
NA
JPT
CHB, CHD, CHS, JPT,
MAS
NA
0.5 0.8
0.8 0.8
0.8
0.05
0.05
0.05
32,843,099
135,551,435
NA
NA
34,025,092
135,768,238
NA
NA
CHB, CHD, CHS, GIH,
INS, JPT, MAS
CHB, JPT, MAS
NA
NA
NA
NA
VEGFC,IRF2
NA
0.1 0.05
56,766,427
56,785,344
CHS
NA
4
4
4
5
32,709,603
134,211,631
175,225,231
5,472,363
34,377,130
142,886,432
187,552,805
13,945,346
5
50,304,726
83,541,796
CHB, CHD, CHS, GIH,
INS, JPT, MAS
CHB, CHD, CHS, JPT,
MAS
CHB, CHD, CHS, JPT,
MAS
NA
NA
NA
ERBB4
NISCH
CADPS
BC035247
COL29A1,CPNE4
5
88,817,624
109,333,621
5
117,327,664 117,922,801
5
144,740,153 145,341,692
6
6
17,538,157
47,152,974
38,019,008
52,441,050
6
6
6
7
7
7
65,310,924
138,951,795
151,793,601
8,149,845
13,708,216
79,973,591
93,068,220
143,385,463
161,155,132
12,656,000
25,381,591
97,825,437
7
8
8
98,319,021 133,654,097
53,312,825 69,718,665
103,015,890 116,977,822
8
9
9
118,469,843 136,264,032
1,409,638
1,608,419
78,984,573 86,304,688
10
20,686,445
30,470,812
10
43,089,450
72,592,373
10
10
11
12
12
13
90,013,013
109,273,134
2,745,638
12,334,440
97,309,698
31,666,843
108,886,554
110,380,402
7,732,438
24,752,300
105,781,266
38,937,812
13
56,834,700
70,731,938
13
79,516,916
91,765,949
14
44,308,706
64,031,232
14
15
75,475,877
25,947,855
89,115,878
27,654,486
15
18
36,269,405
55,392,483
67,966,469
64,773,274
19
8,214,446
13,425,865
20
21
36,150,356
22,287,335
59,287,333
26,182,624
Indonesia_Group5
China_Group2
Thailand_Group5
Malaysia_Group1
Thailand_Group4
Thailand_Group3
China_Group1
China_Han
Philippine_Negritos
Taiwan_Indigenous
Malaysia_Group2
India_Group1
Taiwan_Indigenous
Malaysia_Negritos
Thailand_Group5
Thailand_Group4
Taiwan_Indigenous
Thailand_Group3
Indonesia_Group5
Thailand_Group1
Malaysia_Negritos
Thailand_Group5
Malaysia_Group2
Indonesia_Group5
Taiwan_Indigenous
Malaysia_Group2
Thailand_Group4
Indonesia_Group3
Indonesia_Group3
Malaysia_Group2
Philippine_Negritos
Indonesia_Group5
Taiwan_Indigenous
Thailand_Group3
Taiwan_Indigenous
China_Group2
Taiwan_Indigenous
China_Group2
Thailand_Group5
Malaysia_Group2
Malaysia_Negritos
China_Group2
Philippine_Negritos
Malaysia_Negritos
Indonesia_Group6
Indonesia_Group3
Thailand_Group1
Taiwan_Indigenous
Philippine_Negritos
Malaysia_Group2
Indonesia_Group4
Malaysia_Group2
Indonesia_Group5
Thailand_Group3
China_Group2
China_Group1 Korean
Japan_Main
Thailand_Group5
Indonesia_Group3
Philippine_Negritos
0.1 0.05
NA
NA
NA
RGMB,BC042169,CHD1,SLCO
6A1
0.9 0.9
0.9 0.9
0.9 0.9
117,373,324
117,701,041
CHB, CHD, CHS, JPT,
MAS
NA
0.8 0.85
NA
NA
NA
0.05 0.05
0.15
26,232,282
0.15
NA
35,693,077
NA
0.05 0.05
0.1
0.1
0.05
0.05
0.05
0.05
88,878,455
NA
NA
NA
NA
NA
69,850,800
NA
NA
NA
NA
NA
CHB, CHD, CHS, GIH,
INS, JPT, MAS
RNF144B,SCGN,HLADRB5,HLADRB6,COL11A2,RXRB,SLC39
A7
OPN5,PKHD1
MAS
NA
NA
NA
NA
NA
CHB, CHD, CHS, JPT,
MAS
MAS
CHS
BAI3
NA
NOX3
THSD7A
NA
NA
NA
NA
NA
0.15 0.05 111,910,801
0.05
66,841,775
0.1
111,526,351
131,264,503
67,126,543
111,757,796
0.05 0.05 129,601,852
0.9
NA
0.1
NA
129,666,059
NA
NA
0.05
22,715,863
24,435,153
0.05 0.05 55,537,663
65,892,784
0.05 0.05
0.4
0.1
0.05
0.1
0.1
92,947,202
NA
NA
NA
98,739,698
NA
107,547,879
NA
NA
NA
99,296,032
NA
CHB, CHD, JPT
NA
NA
CHB, CHD, CHS, JPT,
MAS
CHB, CHD, CHS, JPT,
MAS
CHB, CHD, CHS, JPT,
MAS
NA
NA
NA
CHB, CHD, CHS, JPT
NA
0.1 0.05
60,292,838
60,493,599
JPT, MAS
DIAPH3
0.1 0.1
0.05
NA
NA
NA
NA
0.05 0.1
0.1
49,021,100
49,473,932
JPT
C14orf106,BMP4
0.05 0.1
0.85
76,948,436
NA
77,059,691
NA
CHS
NA
KIAA0743,NRXN3
NA
62,904,700
NA
CHB, CHD, CHS, JPT,
MAS
NA
NA
LOC390858
NA
NA
NA
52,648,939
NA
CHD, JPT
NA
GCNT7,C20orf85
NA
0.05 0.05
0.05
40,233,690
0.05
NA
0.85 0.8
0.85
NA
0.05 0.05 52,491,840
0.1
NA
C7orf58,EU233817,PLXNA4
NA
NA
KIAA1217,PRINS
CSGALNACT2,PRKG1
BC035398
GRIN2B,CR623725
STARD13,DCLK1
4 References
1. Hindorff, L.A. et al. Potential etiologic and functional implications of genome-wide association loci
for human diseases and traits. Proc Natl Acad Sci U S A 106, 9362-7 (2009).
2. Abdulla, M.A. et al. Mapping human genetic diversity in Asia. Science 326, 1541-5 (2009).