mec13182-sup-0001-Supinfo

Table S1: Genotyped individuals with the 60K SNP array.

| Country (ISO code) | Population/breed | Location | Code | N |
|---|---|---|---|---|
| USA (US) | Mulefoot | NA | MUL | 18 |
| USA (US) | Ossabaw pig | Ossabaw island | OB | 7 |
| USA (US) | Yucatan | Indiana | YU | 10 |
| USA (US) | Guinea Hog | Several locations | GH | 15 |
| Mexico (MX) | Cuino | Nayarit | CU | 7 |
| Mexico (MX) | Hairless | Several locations | HL | 8 |
| Cuba (CU) | Creole | Pinar del Río (West) | WE | 12 |
| Cuba (CU) | Creole | Sancti Spiritus (Centre) | CE | 1 |
| Cuba (CU) | Creole | Granma (East) | EA | 5 |
| Guadeloupe (GP) | Creole | Guadeloupe | CR | 4 |
| Guatemala (GU) | Creole | Baja Verapaz, Salamá | CR | 14 |
| Costa Rica (CR) | Creole | Guanacaste, Alajuela | CR | 12 |
| Costa Rica (CR) | Coco | Coco island | IC | 12 |
| Colombia (CO) | Zungo | Cereté (Córdoba) | ZU | 10 |
| Colombia (CO) | Creole | Alto Baudó (Chocó) | CR | 11 |
| Ecuador (EC) | Creole | Loja | CR | 4 |
| Peru (PE) | Creole | Titicaca area | CR | 16 |
| Brazil (BR) | Moura | Concórdia | MO | 9 |
| Brazil (BR) | Monteiro | Poconé | MT | 10 |
| Brazil (BR) | Piau | Bahia | PU | 9 |
| Brazil (BR) | Nilo | Goias | NI | 2 |
| Bolivia (BO) | Creole | Oruro | CR | 3 |
| Paraguay (PY) | Feral pig | San Pedro | FP | 3 |
| Argentina (AR) | Creole | Misiones | MS | 9 |
| Argentina (AR) | Feral pig | Esteros del Iberá | FP | 6 |
| Argentina (AR) | Semi feral | Formosa | FO | 10 |
| Uruguay (UY) | Cerdo Pampa | Rocha | CP | 1 |
| Spain (ES) | Iberian | Several locations | IB | 16 |
| Spain (ES) | Canarian | Canary islands | CA | 4 |
| Portugal (PT) | Bisaro | Several locations | BI | 14 |
| Italy (IT) | Black Sicilian | Sicily | SI | 4 |
| Poland (PL), Hungary (HU), Tunisia (TN) | Wild boar | Several locations | WB | 13 |
| Denmark (DK), Holland (NL), USA | Duroc | Several locations | DU | 20 |
| Denmark (DK), Holland (NL), USA | Landrace | Several locations | LR | 20 |
| Denmark (DK), Holland (NL), USA | Large White | Several locations | LW | 20 |
| GB, USA | Hampshire | Several locations | HS | 14 |
| China (CN) | Jiangquhai | Jiangsu | JQ | 11 |
| China (CN) | Jinhua | Zhejiang | JH | 17 |
| China (CN) | Xiang pig | Guizhou | XI | 13 |
| China (CN) | Meishan | Jiangsu | MS | 17 |

A full description and analysis of these populations is given in Burgos-Paz et al. (2013).
The individual code in the genotype files provided in Dryad is made up of the country
ISO code, the breed code and a number. For international breed samples, the code
contains the breed code, the country if known, and a number.
Table S2: Simulation results summary with varying demographic parameters (real
mean = 1.91).

| % Asian admixture | t (a) | Mean | MSE (b) |
|---|---|---|---|
| 12 | 1 | 1.69 | 0.017 |
| 25 | 1 | 1.86 | 0.015 |
| 25 | 4 | 1.77 | 0.020 |
| 25 | 6 | 1.69 | 0.012 |
| 50 | 6 | 2.10 | 0.021 |

(a) Number of generations between backcross and founder bottleneck.
(b) Mean squared error between observed and simulated values across windows.
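
The MSE column in Table S2 is described only as a mean squared error between observed and simulated values across windows. A minimal sketch of that calculation follows, assuming one observed and one simulated value per window, paired by position. The function name and the example numbers are illustrative and are not taken from the study.

```python
def mean_squared_error(observed, simulated):
    """Mean squared difference between paired per-window values.

    Assumption: both sequences hold one value per window, in the same order.
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    if not observed:
        raise ValueError("at least one window is required")
    squared_diffs = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return sum(squared_diffs) / len(squared_diffs)


# Illustrative values only (three hypothetical windows).
example_mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
```

Whether the authors applied any weighting or normalisation before averaging is not stated here, so this should be read as the plain textbook definition.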
Table S3: List of genes in highly homozygous SSC8 region
Ensembl Gene ID
ENSSSCG00000008854
ENSSSCG00000008855
ENSSSCG00000008857
ENSSSCG00000008858
ENSSSCG00000020868
ENSSSCG00000008860
ENSSSCG00000030208
ENSSSCG00000008862
ENSSSCG00000021559
ENSSSCG00000008863
ENSSSCG00000021125
ENSSSCG00000021928
ENSSSCG00000026546
ENSSSCG00000008866
ENSSSCG00000030504
ENSSSCG00000008869
ENSSSCG00000008868
ENSSSCG00000008867
ENSSSCG00000008870
ENSSSCG00000024960
ENSSSCG00000024153
ENSSSCG00000021506
ENSSSCG00000019051
ENSSSCG00000024685
Associated Gene Name (a)
CPE
H3F3C
SC4MOL
KLHL2
(TMEM192)
(YBX1)
TRIM61
(TRIM60)
(TMEM192)
(TRIM60)
(TRIM60)
(TRIM60)
(TRIM60)
GUCY1A3
ASIC5
TDO2
CTSO
GUCY1B3
PDGFC
U12
GLRB
SNORA11
GRIA2
Gene Start (bp)
45,395,481
45,490,027
45,557,528
45,678,240
45,932,754
45,970,426
45,990,930
46,043,239
46,125,730
46,150,254
46,166,029
46,178,462
46,234,403
46,456,920
46,567,887
46,642,261
46,704,219
46,724,245
46,746,538
47,675,547
47,852,183
48,065,093
48,088,056
48,116,708
Gene End (bp)
45,592,351
45,490,455
45,618,352
45,785,868
45,947,825
45,971,348
45,992,312
46,044,558
46,127,733
46,151,555
46,167,414
46,179,637
46,235,785
46,537,331
46,605,587
46,670,059
46,721,468
46,739,145
46,765,486
47,774,058
47,852,328
48,071,919
48,088,180
48,277,325
θ IC (b)
θ IB
θ WB
θ DU
θ LW
θ MS
0.57
2.55
0.32
0.48
0.76
0
1.20
1.73
1.26
1.26
0.76
0.88
0.85
0.41
0.61
0.28
0.38
0.30
0.46
0.43
0
0.08
0
0.24
0.98
1.27
1.13
0.42
3.22
2.96
2.76
0
0.70
0.84
0.79
0.46
1.91
0.13
0.38
0.49
0.38
0.46
0.27
0.17
0
0.08
0
0.05
2.89
0
2.12
2.33
3.67
0.59
1.13
0
1.38
0.42
0.39
0.93
0.72
0.25
0.28
0.51
0.34
0.29
0.24
0.19
0
0.08
4.36
0.22
0.14
0
0.13
0.06
0.00
0
0
0
NA (c)
0
0
0
0
0.12
0.03
0.09
0.20
0.08
0
0.06
0
0
0
0.07
0.01
0
0
0.08
0.25
0
1.53
0
NA
0
0
0
0
0.04
0
0.04
0
0.08
0.13
0
0
0
0
0.04
2.94
0
1.79
1.59
3.91
0
0
10.29
0
4.35
6.13
2.14
0.88
2.51
1.82
1.61
1.60
3.42
3.69
0.20
0
0.16
0
0.34
ENSSSCG00000023923
ENSSSCG00000008873
ENSSSCG00000008874
ENSSSCG00000008875
ENSSSCG00000028492
ENSSSCG00000008877
ENSSSCG00000008878
ENSSSCG00000021470
ENSSSCG00000008881
ENSSSCG00000021841
ENSSSCG00000008883
ENSSSCG00000020037
ENSSSCG00000027183
ENSSSCG00000028616
ENSSSCG00000021221
ENSSSCG00000008885
ENSSSCG00000026906
ENSSSCG00000018686
ENSSSCG00000025657
ENSSSCG00000021600
ENSSSCG00000008886
ENSSSCG00000008887
ENSSSCG00000008888
FAM198B
TMEM144
RXFP1
C4orf46
ETFDH
PPID
RAPGEF2
U6
(KLF5)
U6
(FSTL5)
(FSTL5)
(FSTL5)
(FSTL5)
SNORA18
(GAG-POL)
NAF1
NPY1R
48,273,864
49,264,227
49,361,034
49,914,550
50,009,519
50,013,290
50,060,927
50,353,948
50,390,540
50,421,693
50,868,840
52,668,577
52,841,580
53,082,188
53,243,631
53,594,678
53,795,890
54,284,641
54,307,762
54,512,171
54,567,202
55,108,280
55,280,140
48,273,975
49,325,615
49,430,751
50,005,946
50,013,268
50,060,767
50,075,150
50,354,027
50,684,112
50,421,799
50,869,723
52,668,683
52,897,316
53,101,899
53,381,033
53,594,923
53,936,894
54,284,756
54,308,037
54,563,745
54,569,151
55,148,770
55,290,347
(a) In parentheses, orthologous associated gene names.
(b) θ: Watterson's theta estimator per kb.
(c) NA: not enough coverage to evaluate variability in that gene.
0
0.52
0.79
0.45
0
0.39
0.27
0
0.39
0
1.09
0
0.16
0.20
0.20
0
0.40
0
0
0
NA
0.44
0.41
0
0.14
0.33
0.24
0
0.18
0.04
0
0.37
0
6.77
0
0.20
0
0.23
0
0.12
0
0
2.73
0
0.04
0.27
0
0.86
1.46
0.20
0
0.12
0.12
0
0.79
10.20
4.36
0
0.34
0
0.61
0
0.26
0
0
8.33
0
0.31
0.47
0
0.10
0.12
0.04
0
0.05
0
0
0.06
0
8.13
0
0.14
0
0.07
0
0.12
0
NA
NA
NA
0.03
0.11
0
0.06
0
0.03
0
0.05
0.16
NA
0.04
0
0
0
0.06
0
0.05
0
0.07
0
NA
NA
NA
0
0
0
1.61
2.58
1.89
0.48
0.68
1.01
0
1.58
5.10
0
5.10
1.06
2.01
0.64
2.22
1.62
0
0
NA
NA
1.70
0.12
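
The per-gene values in Table S3 are reported as Watterson's theta estimator per kb. As a reminder of what that quantity is, here is a minimal sketch of the textbook estimator, theta_W = S / a_n with a_n = sum of 1/i for i = 1..n-1, scaled to a per-kb value. The function and parameter names are illustrative. Estimators used with sequencing data often include coverage corrections that are not reproduced here, so this is not necessarily the exact procedure behind the table.

```python
def watterson_theta_per_kb(segregating_sites, n_sequences, length_bp):
    """Textbook Watterson's theta, expressed per kilobase.

    segregating_sites: number of polymorphic sites observed in the region (S)
    n_sequences: number of sampled sequences (chromosomes), must be >= 2
    length_bp: length of the region in base pairs
    """
    if n_sequences < 2:
        raise ValueError("at least two sequences are needed")
    if length_bp <= 0:
        raise ValueError("region length must be positive")
    # Harmonic number a_n = 1 + 1/2 + ... + 1/(n - 1)
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    theta_region = segregating_sites / a_n
    return theta_region / (length_bp / 1000.0)


# Illustrative values only: 3 segregating sites, 4 sequences, a 2 kb region.
example_theta = watterson_theta_per_kb(3, 4, 2000)
```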
Fig. S1: NJ-tree based on 1-IBS distances from the 60k array genotypes for the worldwide diversity panel (Table S1).
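
The tree in Fig. S1 is built from 1-IBS distances. A minimal sketch of one common way to compute such a distance between two individuals follows. It assumes genotypes are coded as allele counts (0, 1 or 2) with `None` for missing calls, and that SNPs missing in either individual are skipped. The coding and the function name are assumptions for illustration, and the software and exact definition used for the figure are not stated here.

```python
def one_minus_ibs(genotypes_a, genotypes_b):
    """1 - IBS distance between two individuals.

    Per SNP, the number of alleles shared identical-by-state is
    2 - |a - b| for genotypes coded as allele counts 0/1/2.
    SNPs with a missing call (None) in either individual are ignored.
    """
    pairs = [(a, b) for a, b in zip(genotypes_a, genotypes_b)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no SNPs genotyped in both individuals")
    shared_alleles = sum(2 - abs(a - b) for a, b in pairs)
    ibs = shared_alleles / (2 * len(pairs))
    return 1.0 - ibs


# Illustrative genotypes only (five hypothetical SNPs, one missing call).
example_distance = one_minus_ibs([0, 1, 2, 2, None], [0, 2, 0, 2, 1])
```

A pairwise matrix of these distances across all individuals is the kind of input a neighbour-joining routine expects.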
Fig. S2: Simulated vs. observed variability across 2,000 random windows in Coco
pigs with the best fitting model in Table S2. The demographic model involved an
F1, followed by a backcross and one generation of panmixia before the bottleneck.
Fig. S3: Unsupervised ADMIXTURE analysis (K=13) with the 60k SNP data. Breed codes as in Table S1.
[Figure S4 plot: 19 panels, numbered 1 to 19 (one per chromosome); x-axis: Window; y-axis: EHet, ranging from 0.0 to 1.0.]
Fig. S4: Heterozygosity of Coco samples per chromosome as observed from the 60k
SNP array data. Note the low heterozygosity region in SSC8, marked with an arrow.
Chromosome '19' corresponds to SSCX, also displaying an extremely low diversity
(see Table 1).
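
To illustrate the kind of per-window summary plotted in Fig. S4, the sketch below computes the fraction of heterozygous genotype calls in consecutive windows along one chromosome. It assumes genotypes coded as allele counts (0, 1, 2, with `None` for missing) and windows defined by a fixed number of SNPs. Both choices, and the function name, are assumptions for illustration; the window definition actually used for the figure is not given here.

```python
def windowed_heterozygosity(snp_genotypes, snps_per_window):
    """Fraction of heterozygous calls in consecutive SNP windows.

    snp_genotypes: list with one entry per SNP (in map order); each entry
                   is a list of genotype calls across individuals.
    snps_per_window: number of consecutive SNPs per window (assumed fixed).
    Returns one value per window, or None for a window with no calls.
    """
    if snps_per_window <= 0:
        raise ValueError("snps_per_window must be positive")
    values = []
    for start in range(0, len(snp_genotypes), snps_per_window):
        window = snp_genotypes[start:start + snps_per_window]
        calls = [g for snp in window for g in snp if g is not None]
        if calls:
            values.append(sum(1 for g in calls if g == 1) / len(calls))
        else:
            values.append(None)
    return values


# Illustrative data only: four SNPs, two individuals, windows of two SNPs.
example_het = windowed_heterozygosity([[0, 1], [1, 1], [2, 2], [0, None]], 2)
```

A run of windows with values at or near zero is what a low heterozygosity region, such as the one marked on SSC8, would look like in this summary.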