Supplementary Fig. S1. A long PCR scheme for retrieval of missing sequence corresponding
to a breaking point in a presumed misassembled region at the end of a pseudomolecule.
Supplementary Fig. S2. Diversity along soybean chromosomes 1 to 20. The horizontal axes are Mbp coordinates along the Williams 82
reference genome, and approximate centromere positions proposed by Schmutz et al. 2010 are denoted by the thick arrows. Shown for each
chromosome are the relationship between physical and genetic positions (cM, black line) and the corresponding recombination rates (red line,
cM/Mb) calculated from 100-kb sliding windows for the genomic regions between adjacent markers (top panel), numbers of genes per 100
kb (black line) and numbers of transposable elements (TE) per 100 kb (middle panel), and numbers of single nucleotide polymorphic (SNP)
sites per 100 kb (left y axis) and percentages of shared SNPs of Hwangkeum per the number of SNPs for IT182932 (bottom panel). In the top
panel, the discrepant regions between the genetic and sequence-based physical maps are denoted by discontinuities. In the bottom panel,
green lines represent Williams 82K, blue lines IT182932, red lines Hwangkeum, and black lines % shared SNP.
Supplementary Fig. S3. Single nucleotide divergence between soybean variants. The
approximately 3.2 million single nucleotide differences among three soybean accessions, which
were identified by aligning short reads against the Williams 82 reference genome sequence,
were classified into shared and unique variations. The overlapping regions represent variants
shared between two accessions or among all three accessions.
Supplementary Table S2. Distribution of diversity or polymorphism information content
(PIC) values from genotyping 258 indel markers onto a diverse set of 12 soybean variants
PIC value    Number of indel markers
0.1531       138
0.2779       27
0.2918       5
0.3750       20
0.3775       3
0.4029       4
0.4449       22
0.4862       16
0.4866       5
0.5000       6
0.5418       4
0.5695       1
0.5696       2
0.6112       2
0.6528       1
0.7085       1
0.7502       1
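The source does not state which formula produced the values in Supplementary Table S2. As a minimal sketch, assuming the common gene-diversity definition (one minus the sum of squared allele frequencies), the Python function below computes such a value for one marker. The function name and the genotype calls are hypothetical and are not taken from the study.

```python
from collections import Counter


def gene_diversity(genotypes):
    """Return 1 - sum(p_i^2), where p_i is the frequency of allele i.

    Assumption: this is the gene-diversity definition, used here only
    as an illustration; the source does not state its formula.
    """
    counts = Counter(genotypes)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical calls for one indel marker scored on 12 accessions:
# 10 accessions carry allele "A" and 2 carry allele "B".
calls = ["A"] * 10 + ["B"] * 2
print(round(gene_diversity(calls), 4))  # 0.2778
```

Under this assumption a 10:2 split gives 0.2778 and a 6:6 split gives 0.5, close to the tabulated 0.2779 and 0.5000. That agreement suggests, but does not confirm, that the table was built with a comparable formula.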
Supplementary Table S3. List of indel markers in marker intervals corresponding to putative introgression regions in Hwangkeum (Glycine
max) from G. soja where more than half of indel markers polymorphic between Williams 82 (G. max) and IT182932 (G. soja) were shown to
be monomorphic between Hwangkeum and IT182932
Indel marker in  Genome    Polymorphism or    Forward primer         Reverse primer         Allele size in
marker-sparse    sequence  monomorphism                                                     Williams 82
interval         position  between Hwangkeum
                           and IT182932
GMES0924 to Satt631 (Gm03: 1310955..2916475)
GSINDEL19600     1516234   Polymorphism       GAATTGTATTCTGAAACAGC   CACTCAAATCCTACGTTTAC   161
GSINDEL19661     1684062   Monomorphism       TTTACTATGGCATGATTTCT   TTCACCTAGGTAATTTTGAA   143
GSINDEL19680     1742470   Monomorphism       TTACATTGTTCAATCCTACC   TTTTCTTCTTGCCTTTAGTA   152
GSINDEL19728     1908133   Monomorphism       GGATTTTTCAATTGATTTTA   TCATCTCTCTCCTAACAGAA   115
GSINDEL19743     1942952   Monomorphism       TGCATTCCAATACTATTACC   AGTGATTTATGCTTTTTCAC   116
GSINDEL19792     2045321   Polymorphism       GTACTTCCATTAAAACATGC   ATGCTTTTGTTGTTGATTAT   243
GSINDEL19902     2324329   Monomorphism       GTCCTCTGAACAATAAACTG   TACACCGATTCCTTTAAATA   216
GSINDEL19990     2558325   Polymorphism       AATGGTTCACAAAACTTAGA   TATCACAGAAGAAGAGGCTA   248
Satt316 to Sca-364a (Gm06: 47485161..48661496)
GSINDEL56203     47572099  Monomorphism       AAATAAGCAATAGGCACTAA   GTTTTTAATTATGAGGCAAA   130
GSINDEL56300     47798156  Monomorphism       ATACGTGGCAATAGTATGAT   CAAGATTTTGAGTTAAGGTG   122
GSINDEL56311     47820269  Monomorphism       TGAGAAATCGTTTATTTCAT   CTTTGTTTTTCTTAAGGTGA   179
GSINDEL56321     47837905  Polymorphism       CTTGTTTTTGTTGATTCTTC   TTAACCTATTTTCTGTCCAA   200
GSINDEL56371     47956076  Monomorphism       TATTTCTTTTGAAACAGACC   TACTCTTCCCTTTGTCATTA   180
GSINDEL56385     47986268  Monomorphism       GAATAAGAAAGAGAGGAAGC   TAGGGGAAAATGAAGACTA    177
GSINDEL56400     48014768  Monomorphism       ATACATTTCATTTCATCCAG   AAGTTTCACGTCAGTTAAAA   160
GSINDEL56414     48074822  Monomorphism       TGACAACTAAAATGACAATAA  CATTTGACATTGCTATTATG   132
GSINDEL56439     48178039  Monomorphism       GATCCAACTTACCATAATCA   TCAAAAATAAAATGGAGTGT   198
GSINDEL56531     48415540  Polymorphism       TCTTCAATTCCGAATACTAA   ATATATCAACGAAATGCTTC   161
GMES1600 to GMES6736 (Gm09: 39751797..41849066)
GSINDEL84893     40031613  Polymorphism       CAATTTTTAAACAAGCTCAA   AGTCTTTTCATGTTATGCAC   195
GSINDEL85007     40364072  Monomorphism       ACCAGCAACACATTATTTAT   TGCTGAACTGTCTTCTACTT   210
GSINDEL85023     40412092  Monomorphism       GTAACACGACACAAACTTCT   GAACAAAATGAAAATATGCT   163
GSINDEL85030     40419277  Monomorphism       GAATGAATGAATGTTTGTTT   GGTAGTGAATTACAACCAAG   130
GSINDEL85048     40498520  Polymorphism       GTAAGGACTAAGGATAAAGC   CTTTCAAGCTGGATTTGAC    204
GSINDEL85121     40623731  Monomorphism       ACTGTGTTGTTAGCATTTTT   CCAACTCGTCAACTCTATT    187
GSINDEL85147     40673473  Monomorphism       AAAGAGTTGCATTACAAGAG   CTTCCCTTTTCTTCTTTTAT   134
GSINDEL85155     40694764  Monomorphism       TGCCATATCTTATCTTTTGT   GGACTGTGTACTTGATAGGA   141
GSINDEL85193     40763413  Monomorphism       GACTCTTCTTCTGTCTCCTC   TTTTAATTGGGTGAGAGTAA   169
GSINDEL85269     41096510  Monomorphism       TATGCTCGTACTGAGATTTT   GAGAGTGATCCATTCAAAG    232
GSINDEL85271     41103592  Monomorphism       ACTCAGGAGATTCTTGAAAT   GCTAGTCAATTGGAAACAT    134
GSINDEL85272     41109897  Monomorphism       TATACACCGAGCTTAATAGG   AAGACCTTCAGTACAGTTCA   137
GSINDEL85282     41140003  Monomorphism       ATTTACCATGAGCAGATTTA   CTTGGTCCAATCTTAGTGT    232
GSINDEL85324     41280723  Monomorphism       GTGCCACTTATGTGAGATAC   AAAAACTTTGATATTGTGGA   130
GSINDEL85332     41294386  Polymorphism       TATACACAAAGTTGCACAAA   ATGTCACTCAAAATAGATGC   179
GSINDEL85333     41295818  Monomorphism       GAGGGGATATCTGTGTATCT   CTTCACTTGGTGATAGAGAA   244
GSINDEL85335     41297158  Polymorphism       ACGTGAAAAGTGTCTCTAAA   TTCATCTTCTCCTTTTCATA   216
GSINDEL85359     41384266  Monomorphism       ATTGAAGAGTCCTCTACCTC   GTAGCTAGCATTTCAAGAAG   183
GSINDEL85435     41537237  Polymorphism       CCCATGACTCTTATCTCATA   GATACTTGGGAAGAGAAAGT   229
A203.p1 to Sca_189b (Gm15: 7469277..8817679)
GSINDEL136839    7777973   Monomorphism       AAAAGAGTGCATAATGATTT   ATTTCCAAGATTTTTCTTTT   117
GSINDEL136934    8075682   Polymorphism       TCTCAAAATAAAAATGGAAG   TTATCAAATAACAAGGGAAT   131
GSINDEL136982    8167899   Monomorphism       ACAAATCCAGCAAACTATA    CTTAGGAAATTCATTTGATG   156
GSINDEL137010    8288840   Polymorphism       CTTTGCAAAATAAGTTTAGG   CTTTTTCTCTCAATTTTTCA   195
GSINDEL137069    8507084   Monomorphism       TGGAATTTTCTGAAATAAAG   TAATCTCAAGAGGAGATGAA   174
GSINDEL137071    8509641   Monomorphism       TTTAGATAACCTTCCTCACA   TTCACAGTAGGTTAGACGTT   171
GSINDEL137103    8609958   Polymorphism       CATAAGGGAGGGTAATACTT   TTAATTGATCCATGTTCATC   133
GSINDEL137141    8721390   Monomorphism       TTGGTGGTATCACTAACTTT   ATTTAGGCTTAGGGTCTAAC   141
Supplementary Table S4. List of primer sets for long PCR amplification of breaking points of discrepant segments between the current
genetic map and Williams 82 genome sequence assembly (Glyma1) and GenBank accession numbers of the retrieved sequences
Discrepant segment                                        Primer set              Forward primer sequence    Reverse primer sequence    GenBank accession number
Chr 5                                                     Chr5A1                  GCAACGTTTGTCTTCGTTCA       GTTAATCTCGCCGGAAAATTG      JQ924191
Chr 11, BE020413-containing 5’ end                        BE020413                AGTTAAGATATGTTGCTTGG       AGTGTTTGTTGTATGGTTGT       JQ924190
Chr 11, 3’ end                                            Chr11B1 (primary PCR)   GGCCACTTCTGGAATCGTAA       GCCCCACTGGAAGTATTTGA
                                                          Chr11B1-2 (nested PCR)  GACTCGGTGACACCATAAGT       GTGAATTGTGTACGGGTTTT
Chr 14                                                    Chr14B2                 GAACATATATGGGGTGCATGA      CATTCTACGCTAGAAGCTGAA      JQ924192
Insertion site of unplaced scaffold_41 on Chr 17, upper   Scaffold41-be           ATATGCCACCCAAATAAAAA       GTTTGGGTGAAAAACAAGAG       JQ924193
Insertion site of unplaced scaffold_41 on Chr 17, lower   Scaffold41-end          CCAGACAAAAGAGAAAGTGG       GGAAGGACAAGGGTTATTTT       JQ924194
Chr 19                                                    Chr19L                  GAAGGATACAAGTGAAAAAGTACAA  GATGTAGACAACATATCCCCTTC    JQ924195
Supplementary Table S5. Summary of mapping by chromosome
Chromosome     Number of  Distance  cM/marker  Physical     kb/marker  Recombination
number         markers    (cM)                 length (Mb)             rate (cM/Mb)
1              128        107.9     0.8        55.9         436.7      1.9
2              113        138.6     1.2        51.7         457.5      2.7
3              67         113.2     1.7        47.8         713.4      2.4
4              59         105.7     1.8        49.2         833.9      2.1
5              79         110.0     1.4        41.9         530.4      2.6
6              74         125.1     1.7        50.7         685.1      2.5
7              64         114.9     1.8        44.6         696.9      2.6
8              93         155.3     1.7        47.0         505.4      3.3
9              73         109.4     1.5        46.8         641.1      2.3
10             80         142.6     1.8        51.0         637.5      2.8
11             76         133.0     1.8        39.2         515.8      3.4
12             72         106.1     1.5        40.1         556.9      2.6
13             110        115.3     1.0        44.4         403.6      2.6
14             76         113.4     1.5        49.7         653.9      2.3
15             78         124.9     1.6        50.9         652.6      2.5
16             65         91.9      1.4        37.4         575.4      2.5
17             64         119.2     1.9        41.9         654.7      2.8
18             73         116.8     1.6        62.3         853.4      1.9
19             75         112.0     1.5        50.6         674.7      2.2
20             62         105.9     1.7        46.8         754.8      2.3
Total/average  1581       2361.2    1.5        950.0        600.9      2.5
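The three ratio columns of Supplementary Table S5 appear to follow directly from the marker count, the genetic distance, and the physical length. The Python sketch below recomputes them for two rows as a consistency check. The function name is hypothetical, and rounding to one decimal place is an assumption made to match the precision shown in the table.

```python
def per_marker_stats(n_markers, map_cM, physical_Mb):
    """Recompute the derived columns of one row of the mapping summary.

    Returns (cM/marker, kb/marker, cM/Mb), each rounded to one decimal
    place. The rounding is an assumption chosen to match the table.
    """
    cM_per_marker = map_cM / n_markers
    kb_per_marker = physical_Mb * 1000 / n_markers  # 1 Mb = 1000 kb
    cM_per_Mb = map_cM / physical_Mb
    return round(cM_per_marker, 1), round(kb_per_marker, 1), round(cM_per_Mb, 1)


# Input values copied from the chromosome 1 row and the total/average row.
print(per_marker_stats(128, 107.9, 55.9))     # (0.8, 436.7, 1.9)
print(per_marker_stats(1581, 2361.2, 950.0))  # (1.5, 600.9, 2.5)
```

Both results agree with the tabulated values, which supports reading the recombination rate column as genetic distance divided by physical length.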
Supplementary Table S6. Comparison of recombination rate of plants with sequenced genomes
Species name (common name)   Predicted    Number of    Sequenced    Transposable  Sequenced genome  Genetic map  Recombination  Adjusted       References d
                             genome size  chromosomes  genome size  element (TE)  size excluding    length (cM)  rate (cM/Mb)a  recombination
                             (Mb)                      (Mb)         content (%)   TE (Mb)                                       rate (cM/Mb)b
Arabidopsis thaliana         125          5            119          16            102               597          5.0            5.9            1, 2, 3
Arabidopsis lyrata           230          8            207          23            159               515          2.5            3.2            2, 4, 5
Fragaria vesca (strawberry)  240          7            210          22            164               559          2.7            3.4            6
Brachypodium distachyon      355          5            272          26            201               1598         5.9            8              7, 8
Brassica rapa (pak choi)     485          10           284          40            170               1123         4.0            6.6            9, 10
Medicago truncatula          454          8            375          30            263               567          1.5            2.2            11, 12
Oryza sativa (rice)          405          12           389          35            264               1530         3.9            5.8            13, 14
Solanum tuberosum (potato)   844          12           727          62            276               762c         1.1            2.8            15, 16
Sorghum bicolor (sorghum)    748          10           730          62            277               1059         1.5            3.8            17, 18
Glycine max (soybean)        1115         20           937          57            402               2361         2.5            5.9            19, this study
Zea mays (maize)             2300         10           2300         85            345               2349         1.0            6.8            20, 21
a Calculated by dividing the map length by the sequenced genome size. Because the sequenced genomes of rice and Arabidopsis showed that flow cytometry tends to
overestimate genome size [Arabidopsis Genome Initiative (2000); International Rice Genome Sequencing Project (2005)], we presume that the sequenced genome
sizes are more accurate than the predicted genome sizes.
b Calculated by dividing the map length by the sequenced genome size excluding TE.
c Average of the genetic lengths of 751 cM for the maternal map and 773 cM for the paternal map.
d References
1. Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408:796–815
2. Hollister JD, Smith LM, Guo YL, Ott F, Weigel D, Gaut BS (2011) Transposable elements and small RNAs contribute to gene expression divergence between
Arabidopsis thaliana and Arabidopsis lyrata. Proc Natl Acad Sci USA 108:2322-2327
3. Lister C, Dean C (1993) Recombinant inbred lines for mapping RFLP and phenotypic markers in Arabidopsis thaliana. Plant J 4:745–750
4. Hu TT, Pattyn P, Bakker EG, Cao J, Cheng JF, Clark RM, Fahlgren N, Fawcett JA, Grimwood J, Gundlach H, Haberer G, Hollister JD, Ossowski S, Ottilar RP, Salamov
AA, Schneeberger K, Spannagl M, Wang X, Yang L, Nasrallah ME, Bergelson J, Carrington JC, Gaut BS, Schmutz J, Mayer KFX, Van de Peer Y, Grigoriev IV,
Nordborg M, Weigel D, Guo Y-L (2011) The Arabidopsis lyrata genome sequence and the basis of rapid genome size change. Nat Genet 43:476-481
5. Kuittinen H, de Haan AA, Vogl C, Oikarinen S, Leppälä J, Koch M, Mitchell-Olds T, Langley CH, Savolainen O (2004) Comparing the linkage maps of the close
relatives Arabidopsis lyrata and A. thaliana. Genetics 168:1575-1584
6. Shulaev V, Sargent DJ, Crowhurst RN, Mockler TC, Folkerts O, Delcher AL, Jaiswal P, Mockaitis K, Liston A, Mane SP, Burns P, Davis TM, Slovin JP, Bassil N,
Hellens RP, Evans C, Harkins T, Kodira C, Desany B, Crasta OR, Jensen RV, Allan AC, Michael TP, Setubal JC, Celton J-M, Rees DJG, Williams KP, Holt SH, Rojas
JJR, Chatterjee M, Liu B, Silva H, Meisel L, Adato A, Filichkin SA, Troggio M, Viola R, Ashman TL, Wang H, Dharmawardhana P, Elser J, Raja R, Priest HD,
Bryant DW Jr, Fox SE, Givan SA, Wilhelm LJ, Naithani S, Christoffels A, Salama DY, Carter J, Lopez Girona E, Zdepski A, Wang W, Kerstetter RA, Schwab W,
Korban SS, Davik J, Monfort A, Denoyes-Rothan B, Arus P, Mittler R, Flinn B, Aharoni A, Bennetzen JL, Salzberg SL, Dickerman AW, Velasco R, Borodovsky M,
Veilleux RE, Folta KM (2011) The genome of woodland strawberry (Fragaria vesca). Nat Genet 43:109-116
7. Huo N, Garvin DF, You FM, McMahon S, Luo MC, Gu YQ, Lazo GR, Vogel JP (2011) Comparison of a high-density genetic linkage map to genome features in the
model grass Brachypodium distachyon. Theor Appl Genet 123:455-464
8. International Brachypodium Initiative (2010) Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature 463:763–768
9. Kim H, Choi SR, Bae J, Hong CP, Lee SY, Hossain MJ, Van Nguyen D, Jin M, Park BS, Bang JW, Bancroft I, Lim YP (2009) Sequenced BAC anchored reference
genetic map that reconciles the ten individual chromosomes of Brassica rapa. BMC Genomics 10:432
10. The Brassica rapa Genome Sequencing Project Consortium (2011) The genome of the mesopolyploid crop species Brassica rapa. Nat Genet 43:1035-1039
11. Mun JH, Kim DJ, Choi HK, Gish J, Debellé F, Mudge J, Denny R, Endré G, Saurat O, Dudez AM, Kiss GB, Roe B, Young ND, Cook DR (2006) Distribution of
microsatellites in the genome of Medicago truncatula: A resource of genetic markers that integrate genetic and physical maps. Genetics 172:2541-2555
12. Young ND, Debellé F, Oldroyd GE, Geurts R, Cannon SB, Udvardi MK, Benedito VA, Mayer KF, Gouzy J, Schoof H, Van de Peer Y, Proost S, Cook DR, Meyers BC,
Spannagl M, Cheung F, De Mita S, Krishnakumar V, Gundlach H, Zhou S, Mudge J, Bharti AK, Murray JD, Naoumkina MA, Rosen B, Silverstein KA, Tang H,
Rombauts S, Zhao PX, Zhou P, Barbe V, Bardou P, Bechner M, Bellec A, Berger A, Bergès H, Bidwell S, Bisseling T, Choisne N, Couloux A, Denny R, Deshpande S,
Dai X, Doyle JJ, Dudez AM, Farmer AD, Fouteau S, Franken C, Gibelin C, Gish J, Goldstein S, González AJ, Green PJ, Hallab A, Hartog M, Hua A, Humphray SJ,
Jeong DH, Jing Y, Jöcker A, Kenton SM, Kim DJ, Klee K, Lai H, Lang C, Lin S, Macmil SL, Magdelenat G, Matthews L, McCorrison J, Monaghan EL, Mun JH,
Najar FZ, Nicholson C, Noirot C, O'Bleness M, Paule CR, Poulain J, Prion F, Qin B, Qu C, Retzel EF, Riddle C, Sallet E, Samain S, Samson N, Sanders I, Saurat O,
Scarpelli C, Schiex T, Segurens B, Severin AJ, Sherrier DJ, Shi R, Sims S, Singer SR, Sinharoy S, Sterck L, Viollet A, Wang BB, Wang K, Wang M, Wang X,
Warfsmann J, Weissenbach J, White DD, White JD, Wiley GB, Wincker P, Xing Y, Yang L, Yao Z, Ying F, Zhai J, Zhou L, Zuber A, Dénarié J, Dixon RA, May GD,
Schwartz DC, Rogers J, Quétier F, Town CD, Roe BA (2011) The Medicago genome provides insight into the evolution of rhizobial symbioses. Nature 480:520-524
13. Harushima Y, Yano M, Shomura A, Sato M, Shimano T, Kuboki Y, Yamamoto T, Lin SY, Antonio BA, Parco A, Kajiya H, Huang N, Yamamoto K, Nagamura Y,
Kurata N, Khush GS, Sasaki T (1998) A high-density rice genetic linkage map with 2275 markers using a single F2 population. Genetics 148:479-494
14. International Rice Genome Sequencing Project (2005) The map-based sequence of the rice genome. Nature 436:793–800
15. The Potato Genome Sequencing Consortium (2011) Genome sequence and analysis of the tuber crop potato. Nature 474:189-195
16. van Os H, Andrzejewski S, Bakker E, Barrena I, Bryan GJ, Caromel B, Ghareeb B, Isidore E, de Jong W, van Koert P, Lefebvre V, Milbourne D, Ritter E, van der
Voort JN, Rousselle-Bourgeois F, van Vliet J, Waugh R, Visser RG, Bakker J, van Eck HJ (2006) Construction of a 10,000-marker ultradense genetic recombination
map of potato: providing a framework for accelerated gene isolation and a genomewide physical map. Genetics 173:1075-1087
17. Bowers JE, Abbey C, Anderson S, Chang C, Draye X, Hoppe AH, Jessup R, Lemke C, Lennington J, Li Z, Lin Y, Liu S, Luo L, Marler BS, Ming R, Mitchell SE,
Qiang D, Reischmann K, Schulze SR, Skinner DN, Wang Y, Kresovich S, Schertz KF, Paterson AH (2003) A high-density genetic recombination map of
sequence-tagged sites for Sorghum, as a framework for comparative structural and evolutionary genomics of tropical grains and grasses. Genetics 165:367–386
18. Paterson AH, Bowers JE, Bruggmann R, Dubchak I, Grimwood J, Gundlach H, Haberer G, Hellsten U, Mitros T, Poliakov A, Schmutz J, Spannagl M, Tang H, Wang
X, Wicker T, Bharti AK, Chapman J, Feltus FA, Gowik U, Grigoriev IV, Lyons E, Maher CA, Martis M, Narechania A, Otillar RP, Penning BW, Salamov AA, Wang
Y, Zhang L, Carpita NC, Freeling M, Gingle AR, Hash CT, Keller B, Klein P, Kresovich S, McCann MC, Ming R, Peterson DG, Mehboob-ur-Rahman, Ware D, Westhoff P,
Mayer KFX, Messing J, Rokhsar DS (2009) The Sorghum bicolor genome and the diversification of grasses. Nature 457:551–556
19. Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T,
Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill
N, Joshi T, Libault M, Sethuraman A, Zhang X-C, Shinozaki K, Nguyen HT, Wing RA, Cregan PB, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC,
Jackson SA (2010) Genome sequence of the palaeopolyploid soybean. Nature 463:178-183
20. Liu S, Yeh C-T, Ji T, Ying K, Wu H, Tang HM, Fu Y, Nettleton D, Schnable PS (2009) Mu transposon insertion sites and meiotic recombination events co-localize with
epigenetic marks for open chromatin across the maize genome. PLoS Genet 5:e1000733
21. Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J, Fulton L, Graves TA, Minx P, Reily AD, Courtney L, Kruchowski SS, Tomlinson C,
Strong C, Delehaunty K, Fronick C, Courtney B, Rock SM, Belter E, Du F, Kim K, Abbott RM, Cotton M, Levy A, Marchetto P, Ochoa K, Jackson SM, Gillam B,
Chen W, Yan L, Higginbotham J, Cardenas M, Waligorski J, Applebaum E, Phelps L, Falcone J, Kanchi K, Thane T, Scimone A, Thane N, Henke J, Wang T, Ruppert
J, Shah N, Rotter K, Hodges J, Ingenthron E, Cordes M, Kohlberg S, Sgro J, Delgado B, Mead K, Chinwalla A, Leonard S, Crouse K, Collura K, Kudrna D, Currie J,
He R, Angelova A, Rajasekar S, Mueller T, Lomeli R, Scara G, Ko A, Delaney K, Wissotski M, Lopez G, Campos D, Braidotti M, Ashley E, Golser W, Kim H, Lee S,
Lin J, Dujmic Z, Kim W, Talag J, Zuccolo A, Fan C, Sebastian A, Kramer M, Spiegel L, Nascimento L, Zutavern T, Miller B, Ambroise C, Muller S, Spooner W,
Narechania A, Ren L, Wei S, Kumari S, Faga B, Levy MJ, McMahan L, Van Buren P, Vaughn MW, Ying K, Yeh CT, Emrich SJ, Jia Y, Kalyanaraman A, Hsia AP,
Barbazuk WB, Baucom RS, Brutnell TP, Carpita NC, Chaparro C, Chia JM, Deragon JM, Estill JC, Fu Y, Jeddeloh JA, Han Y, Lee H, Li P, Lisch DR, Liu S, Liu Z,
Nagel DH, McCann MC, SanMiguel P, Myers AM, Nettleton D, Nguyen J, Penning BW, Ponnala L, Schneider KL, Schwartz DC, Sharma A, Soderlund C, Springer
NM, Sun Q, Wang H, Waterman M, Westerman R, Wolfgruber TK, Yang L, Yu Y, Zhang L, Zhou S, Zhu Q, Bennetzen JL, Dawe RK, Jiang J, Jiang N, Presting GG,
Wessler SR, Aluru S, Martienssen RA, Clifton SW, McCombie WR, Wing RA, Wilson RK (2009) The B73 maize genome: complexity, diversity, and dynamics.
Science 326:1112-1115
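Footnotes a and b of Supplementary Table S6 define the recombination rate as the map length divided by the sequenced genome size, and the adjusted rate as the map length divided by the sequenced genome size excluding TE. The Python sketch below applies those two divisions to the soybean and maize rows as a check. The function name and the dictionary layout are hypothetical; the input numbers are copied from the table.

```python
def recombination_rates(map_cM, sequenced_Mb, sequenced_excl_te_Mb):
    """Return (rate, adjusted rate) in cM/Mb.

    rate          = map length / sequenced genome size (footnote a)
    adjusted rate = map length / sequenced genome size excluding TE (footnote b)
    """
    return map_cM / sequenced_Mb, map_cM / sequenced_excl_te_Mb


# (genetic map length in cM, sequenced size in Mb, size excluding TE in Mb)
rows = {
    "Glycine max": (2361, 937, 402),
    "Zea mays": (2349, 2300, 345),
}

for species, values in rows.items():
    rate, adjusted = recombination_rates(*values)
    print(f"{species}: {rate:.1f} cM/Mb, adjusted {adjusted:.1f} cM/Mb")
# Glycine max: 2.5 cM/Mb, adjusted 5.9 cM/Mb
# Zea mays: 1.0 cM/Mb, adjusted 6.8 cM/Mb
```

The recomputed values match the table. The maize row shows how strongly a high TE content can raise the adjusted rate relative to the unadjusted one.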
Supplementary Table S7. Summary of sequencing and variations for three soybean varieties
                        Variety
Category                IT182932          Hwangkeum         Williams 82
Mapping
  Total bases           36,880,852,541    19,983,900,227    16,544,562,819
  Mean depth            38.82             21.03             17.41
Coverage a
  %_bases_above_1       97.4              97.4              98.5
  %_bases_above_5       93.2              95.9              97.6
  %_bases_above_10      85.5              92.0              89.8
  %_bases_above_20      69.2              56.2              34.2
SNP
  Total                 2,397,205         1,236,277         113,587
  Known                 1,365,216         817,605           44,566
  Homozygous            2,286,168         1,165,945         51,454
  Heterozygous          111,037           70,332            62,133
  Transition            1,575,531         785,647           63,381
  Transversion          821,674           450,630           50,206
  Exon                  116,121           55,029            5,610
  Exon known            73,935            41,882            2,008
  Exon novel            42,186            13,147            3,602
  Exon homozygous       108,819           50,081            2,268
  Exon heterozygous     7,302             4,948             3,342
  Exon transition       68,941            32,637            3,081
  Exon transversion     47,180            22,392            2,529
  CDS                   85,330            40,543            4,388
  5’ UTR                9,706             4,420             423
  3’ UTR                21,886            10,468            856
  Intron                236,471           115,952           9,295
  Silent                38,587            18,489            1,932
  Missense              45,798            21,624            2,401
  Nonsense              896               414               50
  Readthrough           131               65                9
  Splice_site           701               357               50
  Start_codon           162               68                10
Indel
  Total                 302,013           236,276           29,105
  Homozygous            286,097           224,203           20,269
  Heterozygous          15,916            12,073            8,836
  Tandem_repeat         213,098           173,403           22,621
  Exon                  14,072            7,127             1,162
  Exon novel            14,072            7,127             1,162
  Exon homozygous       13,191            6,714             930
  Exon heterozygous     881               413               232
  Exon tandem_repeat    9,885             5,171             917
  CDS                   4,879             2,452             655
  5’ UTR                3,476             1,633             199
  3’ UTR                5,775             3,073             316
  Intron                47,994            28,876            3,285
  Frameshift            2,772             1,531             595
  Inframe               2,107             921               60
  Splice_site           264               139               31
  Start_codon           85                41                5
a Percentage of the Williams 82 reference genome sequence covered with short reads higher than 1, 5, 10, and 20