Evaluation of Chinese Soybean Germplasm to Improve Soybean Production and Utilization

Yao Guo, Mervyn G. Marasinghe and Reid G. Palmer*

Yao Guo, Department of Statistics, Iowa State University, Ames, Iowa 50011, U.S.A. Mervyn G. Marasinghe, Department of Statistics, Iowa State University, Ames, Iowa 50011, U.S.A. Reid G. Palmer, USDA ARS CGR and Departments of Agronomy and Zoology/Genetics, Iowa State University, Ames, Iowa 50011, U.S.A. Received _____________________. *Corresponding author (rpalmer@iastate.edu).

This is a joint contribution of the Iowa Agriculture and Home Economics Experiment Station, Ames, Iowa, Project No. 3769, and from the USDA-Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, and supported by the Hatch Act and the State of Iowa. The mention of a trademark or proprietary product does not constitute a guarantee or warranty of the product by Iowa State University or the USDA, and the use of the name by Iowa State University or the USDA implies no approval of the product to the exclusion of others that may also be suitable.

Abbreviations: PIs, plant introductions; RFLP, restriction fragment length polymorphism; AFLP, amplified fragment length polymorphisms; RAPD, random amplified polymorphic DNA; SSR, simple sequence repeats; NAC, North American cultivars; CCG, central Chinese germplasm; SCG, southern Chinese germplasm; UPGMA, un-weighted pair-group method using arithmetic average; GD, genetic distance; PCA, principal coordinate analyses; MDS, multidimensional scaling.

Abstract

Increased genetic homogeneity among modern soybean cultivars has brought more attention to the potential use of plant introductions (PIs). North American soybean germplasm derives from a limited number of ancestors. Some PIs carry traits that may be useful to add to the gene pool of cultivated soybeans.
Knowledge of genetic diversity patterns in PIs is therefore needed to diversify the North American soybean gene pool efficiently. North American cultivars and two subsets of cultivars collected from the central and southern provinces of China were used in this study. Genomic DNAs from these accessions were evaluated by restriction fragment length polymorphisms (RFLP) with 33 selected probes. The objectives of this research were to measure the variation among North American cultivars and these two subsets of Chinese cultivars and to evaluate the relationships between geographical origin and genetic diversity. Based on this study, there were clear genotypic distinctions among collections from southern China, central China, and the North American cultivars. The southern Chinese germplasm and the central Chinese germplasm were found to have many alleles not observed in North American cultivars. Cluster analysis and principal coordinate analysis of the genetic distance matrix revealed a clear distinction between the cultivars from southern China and those from central China and North America, but only a weak separation between the collections from central China and North America. This pattern suggested that the southern Chinese cultivars might possess useful genetic diversity that could be exploited by soybean breeders to transfer new genes into North American soybean cultivars.

Introduction

The objective of plant breeding programs is to develop improved cultivars by using elite germplasm as parents. Efforts to diversify elite cultivars could increase the rate of quality improvement. Hybridization of cultivars with genetically diverse lines and introgression into cultivars are two ways to introduce high seed yield and other agronomic traits, such as resistance to disease and insect pests and tolerance to poor environmental conditions.
In contrast, limited or reduced genetic diversity is a potential threat to breeders.

The soybean [Glycine max (L.) Merr.] germplasm in North America has a narrow genetic base. The history of soybean cultivar development is relatively short. It was estimated that 80% of the germplasm present in modern cultivars could be traced to just 12 ancestral lines, which were introduced into the USA in the early 1900s (Lorenzen et al., 1995). The extensive use of a small number of ancestors resulted in limited genetic diversity in modern soybean cultivars. To reduce the risks associated with a narrowed genetic base, new alleles are desired for incorporation into modern cultivars. Plant introductions (PIs) and selection are potentially useful in alleviating this problem.

Although they often do not carry the most desired traits found in modern cultivars, PIs can provide otherwise useful traits. PIs may possess resistance to various diseases or insects but may have undesirable agronomic traits such as shattering pods or weak stems. Although PIs generally have poor agronomic qualities, some have been shown to contain genes that improve yield (Thompson et al., 1998). Recent reviews have enumerated the use of soybean genetic resources for agronomic improvement (Hymowitz and Bernard, 1991; Hymowitz et al., 1998; Palmer et al., 1995; Singh and Hymowitz, 1999).

China is a historical center of genetic diversity for soybean, so it has been a very important source of soybean germplasm for North American cultivars (Hymowitz, 2003). In contrast to the narrow genetic base in the USA, current Chinese cultivars share, on average, less than 2% of their genes in common, on the basis of coefficient of parentage analysis (Cui et al., 2000). This is attributed to a continual infusion of new germplasm into applied programs and a strong tendency to avoid the mating of close relatives (Cui et al., 2000).
Chinese soybean germplasm therefore potentially represents a rich source of elite germplasm and PIs for breeding stock, and knowledge of its genetic diversity patterns is needed to use it well. Since the geographical regions of central and southern China are very different, we may expect their genetic diversity patterns to be different.

There are many ways to measure genetic diversity. Traditionally, the assessment of genetic variation has been based on geographic origin and morphological characteristics. Unfortunately, these variations may not be the best representatives of genetic diversity in the entire genome because of the small number of phenotypic markers and the interference of various human and environmental factors (Brown-Guedira et al., 2000; Negash et al., 2002). With the development of molecular technology, molecular genetic markers have become valuable tools to determine genetic diversity, since they are usually unaffected by the environment and can often be generated in large numbers (Vosman et al., 1999). The more common ones include AFLP (Amplified Fragment Length Polymorphisms), RAPD (Random Amplified Polymorphic DNA), RFLP (Restriction Fragment Length Polymorphisms), and SSR (Simple Sequence Repeats). They can be informative, but each has significant benefits and drawbacks.

There are several benefits to using RFLP analysis to evaluate germplasm. RFLPs are usually co-dominant markers, and their relatively low copy number usually makes them easy to ‘score’. The greatest drawbacks are that RFLP analysis is time consuming and relatively costly. Selecting RFLP probes that are known to detect diversity in soybeans could reduce the cost involved. For this study, the use of RFLP probes allows comparisons to be made to other soybean collections where RFLP data have already been collected. RFLPs are also highly repeatable. AFLPs generally have a high copy number. This makes them more difficult to ‘score’.
RAPDs are known to be easy to perform and cost efficient. Although they are not always believed to be highly reproducible, there is also evidence suggesting otherwise (Thompson and Nelson, 1998). SSRs have also increased in popularity, but do not code for specific regions of the genome (Powell et al., 1996).

In our study, RFLPs have been employed to characterize 302 soybean accessions from North America and the central and southern provinces of China. Our objectives were (1) to measure the genetic diversity present in North American cultivars (NAC), central provinces of China germplasm (CCG) and southern provinces of China germplasm (SCG); (2) to compare the genetic relationships among them; (3) to screen the relationship between geographical origin and genetic diversity; and (4) to determine if PIs selected from the CCG and SCG could be useful to breeding programs in North America for selection of diverse parents.

Materials and Methods

Germplasm

The southern Chinese soybean germplasm collection contained 131 plant introductions (PIs). These were collected from seven provinces: Anhui (15), Guangdong (8), Hubei (31), Jiangsu (30), Shanghai (6), Sichuan (19), and Zhejiang (22). The soybean germplasm collected from eight central provinces of China, Anhui (7), Gansu (26), Hebei (3), Henan (11), Jiangsu (10), Shaanxi (19), Shandong (3), and Shanxi (27), consisted of 106 PIs (see Fig. 1). Note that, geographically, the division of provinces into southern and central regions is based on the Yangtze River. Both subsets were compared to 65 North American cultivars. These represent about 80% of the genetic diversity observed in the cultivars currently grown in North America (Gizlice et al., 1994).

As controls, G. max breeding line A81-356022 and the wild annual G. soja (Sieb. & Zucc.) plant introduction PI 468916 were used.
The controls were used to create the USDA/ARS/Iowa State University public molecular soybean linkage map. All alleles scored were compared to the controls.

Probes

The term “probe” refers to the recombinant DNA clone that detects complementary restriction fragments. A single probe may detect multiple fragments at different loci (Keim et al., 1992). The use of RFLP probes that are able to efficiently detect diversity in a given set of germplasm is an important part of analyzing new germplasm. The probes included in this study were previously determined to detect relatively high levels of genetic diversity in soybean cultivars from North America and their ancestors (Lorenzen et al., 1995).

DNA Preparations and RFLP Hybridization

Selected seed were planted in the greenhouse and grown to the second trifoliolate stage. The leaves were collected and freeze-dried. DNA was extracted from the leaves using the chloroform extraction method (Sambrook et al., 1989). DNA was then digested using the restriction enzymes DraI, EcoRI, EcoRV, HindIII, and TaqI. Electrophoresis, Southern transfers, and hybridizations were performed as in Sambrook et al. (1989). Thirty-three DNA probe/enzyme combinations were used to detect diversity in soybean cultivars (Lorenzen et al., 1995). For each probe used, at least one polymorphism was mapped to the USDA-ARS public soybean map (G. max A81-356022 X G. soja PI 468916), whose parents are the aforementioned controls.

Method of Evaluating RFLP Data

Each probe was used to detect at least one mapped polymorphism on the USDA-ARS public soybean G. max by G. soja genetic map (Shoemaker and Specht, 1995). Loci were distributed throughout the soybean linkage map with at least one locus on each linkage group except the Q linkage group (see appendix A). Data were scored according to an allele-locus model suggested by Brubaker and Wendel (1994).
Soybean has primarily dominant or co-dominant markers (Keim et al., 1989). The low copy number of RFLP fragments seen, as is often observed in highly homozygous species, facilitates the use of an allele-locus model. Most bands were monomorphic (only one or two different bands were observed per locus). Any bands at a given locus that were too difficult to score were recorded as a no score. Loci that were difficult to score or had a high frequency of missing data points were discarded completely.

Gene Diversity

Variation in populations can be characterized by heterozygosity or gene diversity, the latter being more appropriate for inbred populations, because such populations contain very few heterozygotes but may contain several different homozygous types (Weir, 1996). The number of alleles per locus was calculated by dividing the observed number of alleles for a group of accessions by the number of loci surveyed. The frequency of a particular allele in a population is called the gene or allele frequency. The proportion of polymorphic loci (P) was calculated by dividing the number of polymorphic loci (those for which the frequency of the most abundant allele was <0.950) by the number of loci surveyed. Considering only codominant alleles in a diploid population, Nei’s (1987) diversity was computed for each RFLP locus as follows:

\[ \hat{h}_l = 1 - \sum_{i=1}^{m} \hat{x}_{il}^{2} \]

where $m$ is the number of alleles and the $\hat{x}_{il}$ represent the frequency of allele $A_i$ at locus $l$ within the total population, which is calculated as

\[ \hat{x}_i = \hat{X}_{ii} + \sum_{j \neq i} \hat{X}_{ij} / 2 \]

where $\hat{X}_{ii}$ is the frequency of $A_i A_i$ and $\hat{X}_{ij}$ is the frequency of $A_i A_j$ in the sample. In a selfing population, $\hat{h}_l$ is given by the following:

\[ \hat{h}_l = n \left( 1 - \sum_{i=1}^{m} \hat{x}_{il}^{2} \right) / (n - 1) \]

where $n$ is the number of individuals sampled.
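As an illustration of the per-locus calculation, the following minimal Python sketch estimates allele frequencies from diploid genotypes and then computes the gene diversity, with the n/(n - 1) factor applied for a selfing population. The function names and the toy genotypes are hypothetical and are not data from this study; handling of unscored ("no score") accessions is omitted.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Estimate allele frequencies from diploid genotypes.

    `genotypes` is a list of (allele, allele) pairs, one per individual.
    Counting allele copies and dividing by 2n is equivalent to
    x_i = X_ii + sum_{j != i} X_ij / 2.
    """
    counts = Counter()
    for first, second in genotypes:
        counts[first] += 1
        counts[second] += 1
    total_copies = 2 * len(genotypes)
    return {allele: c / total_copies for allele, c in counts.items()}


def gene_diversity(genotypes, selfing=True):
    """Gene diversity at one locus: 1 - sum of squared allele frequencies.

    With selfing=True the estimate is multiplied by n / (n - 1),
    where n is the number of individuals sampled (requires n > 1).
    """
    n = len(genotypes)
    freqs = allele_frequencies(genotypes)
    h = 1.0 - sum(x * x for x in freqs.values())
    if selfing:
        h *= n / (n - 1)
    return h


# Hypothetical locus scored on 10 fully homozygous accessions:
# allele frequencies are 0.6, 0.3 and 0.1.
toy_locus = [("A", "A")] * 6 + [("B", "B")] * 3 + [("C", "C")] * 1
h_uncorrected = gene_diversity(toy_locus, selfing=False)  # about 0.54
h_selfing = gene_diversity(toy_locus, selfing=True)       # about 0.60
```

For the toy locus, the squared frequencies sum to 0.46, so the uncorrected diversity is about 0.54 and the selfing estimate is about 0.54 x 10/9 = 0.60.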
The mean gene diversity within a population was estimated by averaging the $\hat{h}_l$ estimates from this population over all $r$ loci:

\[ \hat{H} = \sum_{l=1}^{r} \hat{h}_l / r \]

The sampling variance of $\hat{H}$ may be obtained by

\[ V(\hat{H}) = V(\hat{h}_l) / r \]

where $V(\hat{h}_l)$ is the variance of $\hat{h}_l$ and is given by

\[ V(\hat{h}_l) = \sum_{l=1}^{r} (\hat{h}_l - \hat{H})^{2} / (r - 1) \]

To calculate this value, the observed fragments that were synonymous to either parent of the USDA-ARS-Iowa State public molecular genetic map, G. max X G. soja (Shoemaker and Specht, 1995; Cregan et al., 1999), were combined to represent a single allele at a locus. For example, for accession ‘i’ with a frequency of ‘a’ alleles like G. max (A81-356022), a frequency of ‘b’ alleles like G. soja (PI 468916), a frequency of ‘c’ alleles like the first band unlike either of the mapping parents, and a frequency of ‘d’ alleles different from either of the controls or other “unique” alleles, the formula for gene diversity would be represented as follows:

\[ \hat{h}_l = 1 - (a^{2} + b^{2} + c^{2} + d^{2}) \]

Let $V_l(\hat{h})$ denote the variance for a particular locus. This is the variance generated at the time of the gene frequency survey for a particular locus. Therefore, if one is interested in testing the difference in single-locus heterozygosity between two populations, this variance should be used (Nei, 1987). This value is given by

\[ V_l(\hat{h}) = \frac{2}{2n(2n-1)} \left\{ 2(2n-2) \left[ \sum \hat{x}_i^{3} - \left( \sum \hat{x}_i^{2} \right)^{2} \right] + \sum \hat{x}_i^{2} - \left( \sum \hat{x}_i^{2} \right)^{2} \right\} \]

Genetic Distances

Genetic distances are designed to express the genetic differences between two populations as a single number (Nei, 1987). If there are no differences, the distance could be set to zero, whereas if the populations have no alleles in common at any locus the distance may be set equal to its maximum value, say 1 (Weir, 1996). Genetic distances were estimated through RFLP data by the method of Nei (1987).
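To make Nei's method concrete, here is a minimal Python sketch, not the POPGENE implementation. It computes a genetic identity I as the sum over loci and alleles of the products of the allele frequencies in two populations, divided by the square root of the product of the two corresponding sums of squared frequencies, and then the standard distance D = -ln(I). The function names, the data layout (one dictionary of allele frequencies per locus), and the toy frequencies are assumptions made for illustration only.

```python
import math


def nei_identity(freqs_x, freqs_y):
    """Nei's genetic identity between populations X and Y.

    `freqs_x` and `freqs_y` are lists with one dict per locus,
    each mapping allele name -> allele frequency in that population.
    """
    j_xy = j_x = j_y = 0.0
    for locus_x, locus_y in zip(freqs_x, freqs_y):
        # Alleles missing from one population have frequency 0 there.
        for allele in set(locus_x) | set(locus_y):
            x = locus_x.get(allele, 0.0)
            y = locus_y.get(allele, 0.0)
            j_xy += x * y
            j_x += x * x
            j_y += y * y
    return j_xy / math.sqrt(j_x * j_y)


def nei_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance D = -ln(I)."""
    identity = nei_identity(freqs_x, freqs_y)
    if identity == 0.0:
        # No alleles shared at any locus: D is unbounded.
        return math.inf
    return -math.log(identity)


# Hypothetical two-locus example.
pop_x = [{"a": 0.5, "b": 0.5}, {"a": 1.0}]
pop_y = [{"a": 1.0}, {"a": 1.0}]
d_xy = nei_distance(pop_x, pop_y)
```

Identical populations give I = 1 and D = 0, while populations that share no alleles give I = 0, for which this sketch returns infinity rather than a capped maximum.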
A “genetic identity” $I$ is defined between two populations $X$ and $Y$ as

\[ I = \frac{\sum_{l} \sum_{i} \hat{x}_{ilX} \hat{x}_{ilY}}{\sqrt{\sum_{l} \sum_{i} \hat{x}_{ilX}^{2} \; \sum_{l} \sum_{i} \hat{x}_{ilY}^{2}}} \]

where $\hat{x}_{ilX}$ and $\hat{x}_{ilY}$ are the frequencies of allele $i$ at locus $l$ within populations $X$ and $Y$, respectively. Nei’s standard genetic distance $D$ is defined as

\[ D = -\ln(I) \]

The genetic distance matrix can be obtained by using the POPGENE software version 1.31 (Yeh et al., 1999). In the matrix output by POPGENE, Nei’s genetic identity is above the diagonal and genetic distance is below the diagonal.

Cluster Analysis and Multidimensional Scaling

Relationships among varieties based on RFLP information were investigated using several statistical programs (SAS Inc., 1990). Dendrograms were constructed from clustering and tree analyses using UPGMA (un-weighted pair-group method using arithmetic averages) on the distance matrices based on RFLP data. Furthermore, associations between individuals were graphically depicted through principal coordinate analyses (PCA), which were constructed by using the multidimensional scaling (MDS) procedure (Johnson and Wichern, 1998) in SAS. The ABSOLUTE option was used in order to maintain the distance scale between 0 and 1, thus making interpretation and graphing easier (Thompson et al., 1998).

Multidimensional scaling (MDS) is a procedure that starts with the ‘distances’ between a set of points (or individuals or objects) and finds a configuration of the points, preferably in a small number of dimensions, usually 2 or 3. Thus, in an attempt to represent the genetic distances among populations in a suitable lower dimensional configuration that is reasonably close to the original configuration, the MDS procedure was used on the genetic distance matrix. The closeness is measured by “stress”, which measures the extent to which the new configuration in the lower dimension deviates from the original.
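The idea of stress can be illustrated with a short Python sketch. It uses Kruskal's stress-1, one common definition: the square root of the summed squared differences between the original distances and the distances in the fitted configuration, divided by the summed squared fitted distances. This is an assumption chosen for illustration; the exact badness-of-fit criterion minimized by the SAS MDS procedure may differ. The function names and the toy points are hypothetical.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix for a list of coordinate tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)]
            for i in range(n)]


def stress1(original, configuration):
    """Kruskal's stress-1 for a fitted configuration.

    `original` is the symmetric matrix of input distances and
    `configuration` is the list of fitted low-dimensional points.
    """
    fitted = pairwise_distances(configuration)
    n = len(original)
    squared_error = 0.0
    squared_fitted = 0.0
    for i in range(n):
        for j in range(i + 1, n):  # each pair counted once
            squared_error += (original[i][j] - fitted[i][j]) ** 2
            squared_fitted += fitted[i][j] ** 2
    return math.sqrt(squared_error / squared_fitted)


# Hypothetical distances among three accessions that lie on a line.
toy_distances = [[0.0, 1.0, 3.0],
                 [1.0, 0.0, 2.0],
                 [3.0, 2.0, 0.0]]
exact_fit = stress1(toy_distances, [(0.0,), (1.0,), (3.0,)])
poor_fit = stress1(toy_distances, [(0.0,), (1.0,), (2.0,)])
```

A configuration that reproduces every distance exactly has zero stress, and the value grows as the fitted distances drift away from the originals. Multiplying by 100 expresses it as a percentage.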
For a specified dimension, the MDS procedure finds the configuration that minimizes “stress”. It is standard practice to obtain the minimum stress for configurations of several dimensions and construct a “scree plot”, a plot of stress against the dimension. Usually the dimension at which the stress begins to level off is chosen as the “best” lower dimension. Stress is usually expressed as a percentage, and a value between 0% and 5% is considered acceptable.

Results and Discussion

The same thirty-three probes were used for the samples from all three regions in this study. Within-region diversity was compared by estimating the average diversity across all loci for each region (Table 2). The average diversity over all accessions was 0.47. The average diversity for CCG was 0.41, higher than that for NAC (0.37) and SCG (0.34); CCG was thus the most diverse and SCG the least diverse. The low average diversity of SCG may be explained by the narrow maturity range of the lines, limited geographic sampling, or nonrandom sampling. The range of gene diversity was 0.03 to 0.74 for SCG, 0.08 to 0.56 for CCG, and 0.04 to 0.60 for NAC (Fig. 2).

We may use $V(\hat{H})$ to test the difference in average gene diversity $\hat{H}$ between the CCG and SCG populations (0.41 - 0.34 = 0.07 from the rounded means). The standard error of this difference is $s_d$ = 0.03656, which gives t = 1.94 (p-value > 0.05), indicating that the difference is not statistically significant. Similarly, the differences between CCG and NAC and between NAC and SCG were not significant. When the accessions from two regions were pooled (central China with North America, central China with southern China, or North America with southern China), the combined samples had higher average diversities than a single population.
For example, the combination of CCG and SCG had an average diversity of 0.46 (Table 2), larger than the simple average of CCG and SCG ((0.41 + 0.34)/2 = 0.375). These results indicate that more diversity resides among the CCG, NAC, and SCG populations than within them, and that these populations are diverse from each other. Since the standard error for each locus in a single population was small, this conclusion seems to be reasonable.

The CCG and SCG populations were divided into subgroups by province and a gene diversity analysis was carried out. Within CCG, the highest diversity, 0.40, was found among PIs from Gansu province, almost that of the entire CCG population, 0.41 (Table 3). The second highest diversity was found in Henan, followed by Jiangsu, Shanxi and Shaanxi. All five of these provinces had the highest proportion of polymorphic loci (0.94), except Henan (0.88). Within SCG, the highest diversity, 0.33, was found in Sichuan province, similar to the 0.34 of the entire SCG population. Hubei and Jiangsu were second (0.31) and third (0.28), respectively. Jiangsu had a proportion of polymorphic loci of 0.94, and Sichuan and Hubei each had 0.91. The measure of genetic variation followed the same trend as the proportion of polymorphic loci. It is not surprising that Gansu and Sichuan represent great diversity for CCG and SCG, respectively. Gansu province covers a narrow, elongated area; more than half of it is mountainous, and most farms are separated from each other. Sichuan, one of Gansu’s neighbors, has very complicated geography.

Of the 33 probes screened, eight had two alleles, twenty-one had three alleles, three had four alleles, and one had six alleles (Table 4). A total of 97 alleles were scored and 71% were present in all three populations.
Only 27 alleles had a frequency of 0.10 or less across all populations and most of these were infrequent in each population (Table 4). For loci with more than two alleles, the additional allele(s) had frequencies of 0.10 or less across all lines except four. Fourteen of the 27 infrequent alleles were present in CCG and 16 were present in SCG, while only 3 were present in NAC. Thus, NAC lacked the allelic diversity found in CCG and SCG. CCG and SCG have similar allelic diversities.

The PIs from CCG, NAC, and SCG possessed 85, 74, and 87 of the 97 alleles, respectively. Of the 12 alleles that were null (absent) in CCG, five were present in NAC and nine were present in SCG. Twenty-three alleles were null in NAC; sixteen of them were present in CCG and sixteen in SCG. SCG had ten null alleles: seven of them were present in CCG and three were present in NAC (Table 5). This structure indicates that the PIs that come from CCG and SCG are good sources to diversify the NAC gene pool.

Differentiation between populations was investigated through genetic distances (GD). Estimates of genetic distances among the CCG, NAC, and SCG populations as measured by RFLPs are given in Table 6. As expected, the average GD between CCG and SCG (0.33) and between SCG and NAC (0.29) were higher, while the average GD between NAC and CCG was the lowest, 0.13.

The associations among individuals sampled from the populations were investigated by MDS based on the genetic distance matrix (data not shown). The scree plot (Fig. 3) shows that the configuration obtained with two dimensions is sufficient, since the drop in stress from two to three dimensions is only around 0.05%. Thus a plot of the first two MDS dimensions was created to graphically represent the relationships among the 302 accessions. This plot yielded a good separation between the collections of SCG and CCG and between the collections of SCG and NAC, but a weak separation between CCG and NAC.
Most of the NACs were scattered among the CCGs, but some of them were outliers from the CCGs (Fig. 4). In addition, five NAC cultivars were disjunct from the others and grouped together. These five cultivars were Kingwa (IV), Williams (III), S100 (V) and two Tokyo accessions (VII). One SCG accession was at the top left of the graph, far away from the SCG group. It was PI 588040 and needs further investigation.

Cluster analyses can help to confirm the above conclusions. UPGMA cluster analyses based on the GD matrix were performed. A dendrogram (data not shown) based on the GD matrix of the 302 accessions using the 33-probe set showed that the accessions fell into three relatively distinct clusters according to region of origin. Accessions from southern China formed a large cluster. Several accessions from North America grouped together to form a small cluster. Accessions from central China and North America grouped together and then split into two sub-clusters. Some of the NAC mingled with CCG and some NAC grouped alone (figure not shown). A dendrogram based on a matrix of the mean GD between pairs of the 17 subgroups divided by provinces of China and NAC cultivars (Fig. 5) showed that the CCG and NAC groups were distinct from the SCG group.

All of the analyses corresponded to the geographic origin classification. Soybeans with identical pedigrees should have very similar genotypes (Keim et al., 1992). The cluster and PCA patterns generally reflected the geographical origin of the major ancestors. The independent clusters of SCG and NAC indicated that the germplasm used for cultivar development in North America might be distinct from that of the southern provinces of China.
According to Table 7, all of the North American ancestral lines came from China, especially the northwest region, except for four lines that probably had ancestors from China, Japan, and Korea; eight ancestral lines from China and Korea; two from China and Korea; two from Japan and Korea; and one from Korea. The geographical similarity between the northern region and the central area of China may account for the tendency of these groups to cluster together. The large distance and different origin between SCG and the other germplasm suggested that soybean breeders can maximize genetic diversity in existing populations by crossing CCG with SCG or SCG with NAC.

A further investigation of the genetic structure focused on maturity groups. The dendrogram derived from UPGMA cluster analysis based on the GD matrix for maturity groups within provinces is presented in Fig. 6. The accessions clustered by province and region rather than by maturity group. It seems that the separation by province signifies the importance of geographic boundaries. This in itself would make an interesting study. UPGMA cluster analysis based on the GD matrix for maturity groups alone, across all 302 accessions, showed that maturity groups 00, 0, I, II, III, IV, and V fell in one cluster and VI, VII, and VIII in another cluster (Fig. 7). The small number of sampled accessions may explain the large distance between maturity groups 0 and 00.

In conclusion, the accessions from southern China and central China were found to have many alleles not observed in North American cultivars. The MDS and cluster analyses revealed that the accessions from southern China were strongly distinct from the North American cultivars, whereas some of the North American cultivars were distributed among the accessions from central China. Clustering was distinguishable based upon province of origin.
Considering the generations of independent breeding done in the United States and the reported levels of homogeneity of the North American cultivars (Gizlice et al., 1993, 1994), it is expected that other collections would be different. Our results indicated that the Chinese germplasm did display greater genetic diversity than the North American cultivars. The accessions from southern China are clearly genetically distinct from the North American cultivars; they also appear to be much different from the germplasm collection from the central provinces of China. These factors suggest that the southern Chinese collection would potentially be a good source of new allelic variation for modern cultivars.

References

Brown-Guedira, G.L., J.A. Thompson, R.L. Nelson, and M.L. Warburton. 2000. Evaluation of genetic diversity of soybean introductions and North American ancestors using RAPD and SSR markers. Crop Sci. 40:815-823.
Brubaker, C.L., and J.F. Wendel. 1994. Reevaluation of the origin of domesticated cotton (Gossypium hirsutum: Malvaceae) using nuclear restriction fragment length polymorphisms (RFLPs). Am. J. Bot. 81:1309-1326.
Cregan, P.B., T. Jarvick, A.L. Bush, R.C. Shoemaker, K.G. Lark, A.L. Kahler, N. Kaya, T.T. Van Toai, D.G. Lohnes, J. Chung, and J.E. Specht. 1999. An integrated genetic linkage map of the soybean. Crop Sci. 39:1464-1490.
Cui, Z., T.E. Carter, Jr., and J.W. Burton. 2000. Genetic diversity patterns in Chinese soybean cultivars based on coefficient of parentage. Crop Sci. 40:1780-1793.
Gizlice, Z., T.E. Carter, Jr., and J.W. Burton. 1993. Genetic diversity in North American soybean: I. Multivariate analysis of founding stock and relation to coefficient of parentage. Crop Sci. 33:614-620.
Gizlice, Z., T.E. Carter, Jr., and J.W. Burton. 1994. Genetic base for North American public cultivars released between 1947 and 1988. Crop Sci. 34:1143-1151.
Hymowitz, T., and R.L.
Bernard. 1991. Origin of the soybean and germplasm introduction and development in North America. p. 147-164. In H.L. Shands and L.E. Wiesner (ed.) Use of plant introductions in cultivar development. Part 1. CSSA special publication no. 17. Madison, WI.
Hymowitz, T., R.J. Singh, and K.P. Kollipara. 1998. The genomes of the Glycine. Plant Breed. Rev. 16:289-317.
Hymowitz, T. 2003. Speciation and cytogenetics. p. ? In H.R. Boerma and J.E. Specht (ed.) Soybeans: Improvement, production and uses. 3rd ed. Agron. Monogr. 16. ASA, CSSA, and SSSA, Madison, WI.
Johnson, R.A., and D.W. Wichern. 1998. Applied multivariate statistical analysis. 4th ed. Prentice-Hall, New Jersey.
Keim, P., R.C. Shoemaker, and R.G. Palmer. 1989. Restriction fragment length polymorphism diversity in soybean. Theor. Appl. Genet. 77:786-792.
Keim, P., W. Beavis, J. Schupp, and R. Freestone. 1992. Evaluation of soybean RFLP marker diversity in adapted germplasm. Theor. Appl. Genet. 85:205-212.
Lorenzen, L.L., S. Boutin, N. Young, J.E. Specht, and R.C. Shoemaker. 1995. Soybean pedigree analysis using map-based molecular markers: I. Tracking RFLP markers in cultivars. Crop Sci. 35:1326-1336.
Negash, A., A. Tsegaye, R. van Treuren, and L. Visser. 2002. AFLP analysis of enset clonal diversity in south and southwestern Ethiopia for conservation.
Nei, M. 1987. Molecular evolutionary genetics. Columbia University Press, New York, NY.
Palmer, R.G., T. Hymowitz, and R.L. Nelson. 1995. Germplasm diversity within soybean. p. 1-35. In D.P.S. Verma and R.C. Shoemaker (ed.) Soybean genetics, molecular biology and biotechnology. Commonwealth Agricultural Bureaux International, Wallingford, Oxon., U.K.
Powell, W., M. Morgante, C. Andre, M. Hanafey, J. Vogel, S. Tingey, and A. Rafalski. 1996. The comparison of RFLP, RAPD, AFLP, and SSR (microsatellite) markers for germplasm analysis. Mol. Breed. 2:225-238.
Sambrook, J., E.F.
Fritsch, and T. Maniatis. 1989. Molecular cloning: A laboratory manual. Cold Spring Harbor Press, Cold Spring Harbor, New York.
SAS Institute. SAS/STAT user’s guide. http://statwebserver.stat.iastate.edu/SASOnlineDocV8/sasdoc/sashtml/applet.htm
Shoemaker, R.C., and J.E. Specht. 1995. Integration of the soybean molecular and classical genetic linkage groups. Crop Sci. 35:436-446.
Singh, R.J., and T. Hymowitz. 1999. Soybean genetic resources and crop improvement. Genome 42:605-616.
Smartt, J., and T. Hymowitz. 1985. Domestication and evolution of grain legumes. p. 37-72. In R.J. Summerfield and Z.H. Roberts (ed.) Grain legumes. Collins, London.
Thompson, J.A., R.L. Nelson, and L.O. Vodkin. 1998. Identification of diverse soybean germplasm using RAPD markers. Crop Sci. 38:1348-1355.
Thompson, J.A., and R.L. Nelson. 1998. Core set of probes to evaluate genetic diversity in soybean. Crop Sci. 38:1356-1362.
Vosman, B., P. Arens, I. Rus, and R. Smulders. 1999. The use of molecular markers for the characterization of tomato cultivars and related Lycopersicon species. p. 417-423. In M. Nee, D.E. Symon, R.N. Lester, and J.P. Jessop (ed.) Solanaceae IV. Royal Botanic Gardens, UK.
Weir, B.S. 1996. Genetic data analysis II: Methods for discrete population genetic data. Sinauer Associates, Sunderland, Mass.
Yeh, F.C., R.C. Yang, and T. Boyle. 1999. http://www.ualberta.ca/~fyeh

LIST OF FIGURES

Fig. 1. Map of China. The light orange area marks the selected central provinces; the light green area marks the selected southern provinces.
Fig. 2. Side-by-side boxplots for six populations. CCG = central Chinese germplasm. NAC = North American cultivars. SCG = southern Chinese germplasm. C+N = combined CCG and NAC. C+S = combined CCG and SCG. N+S = combined NAC and SCG.
Fig. 3. Scree plot of stress against dimensions of configurations attempted.
Fig. 4. Principal Coordinate Analysis.
Plot in two dimensions based on the distance matrix among 302 soybean lines obtained from multidimensional scaling analysis. C = CCG (central Chinese germplasm), N = NAC (North American cultivars), and S = SCG (southern Chinese germplasm). There is one SCG outlier (PI 588040).
Fig. 5. Dendrogram resulting from cluster analysis for the RFLP-based mean genetic distances among 15 provinces and North American cultivars (NAC).
Fig. 6. Dendrogram resulting from cluster analysis for the RFLP-based mean genetic distances among maturity groups within provinces.
Fig. 7. Dendrogram resulting from cluster analysis for the RFLP-based mean genetic distances among maturity groups of all cultivars. MG†: maturity group. N‡: number of accessions.

[Fig. 1 map labels: Heilongjiang, Jilin, Xinjiang, Beijing, Liaoning, Xizang, Nei Mongol, Hebei, Ningxia, Shanxi, Qinghai, Shandong, Gansu, Jiangsu, Shaanxi, Henan, Shanghai, Sichuan, Anhui, Hubei, Zhejiang, Jiangxi, Hunan, Fujian, Guizhou, Yunnan, Guangdong, Guangxi; Yellow River, Yangtze River. Axis ticks: 70 to 130 and 20 to 50.]
[Dendrogram residue (Figs. 5-7): distance axes 0.00 to 1.50, 0.0 to 3.0, and 0.00 to 1.50. Leaf labels on the last dendrogram, as extracted: MG† I 0 III II N‡ 5 4 64 48 50 V IV 6 17 00 17 VI VII 21 VIII 71.]

Table 1. Major Chinese soybean ancestral lines selected for diversity analysis.

PI Number   Ancestral Line             Region  Province
PI548497A   Jin yuan                   NE†     Liaoning
PI548493    Huang bao zhu              NE      Jilin
PI561354    Zi hua No. 4               NE      Heilongjiang
PI458506    Feng di huang              NE      Jilin
PI578503    Tie jia si li huang        NE      Jilin
PI602502    Ziong yue xiao huang dou   NE      Liaoning
PI602497    Ke shan si li jia          NE      Heilongjiang
PI464916    Ji ti No. 2                NE      Liaoning
PI458510    Ji ti No. 1                NE      Liaoning
PI602498    Xiao jin huang             NE      Jilin
PI458505    Da bai mei                 NE      Liaoning
PI464917    Ji ti No. 3                NE      Jilin
PI297505    Ji ti No. 5                NE      Heilongjiang
PI430595    58-161                     HHH‡    Jiangsu
PI602501    Tong shan tian e dan       HHH     Jiangsu
PI468408A   Qi huang No. 1             HHH     Shandong
PI578498B   Ju xuan 23                 HHH     Shandong
PI567604A   Xin huang dou              HHH     Shandong
PI602499    Tie jiao huang             HHH     Shandong
PI602991    Shandong si jiao qi        HHH     Shandong
PI602993    Pi xian ruan tiao zhi      HHH     Jiangsu
PI602992    Qin yang shui bai dou      HHH     Henan
PI578491A   Hua xian da lu dou         HHH     Henan
PI578495    Jin dou No. 4              HHH     Shanxi
PI578488B   Feng xian sui dao huang    South   Shanghai
PI464932    Nan nong 493-1             South   Jiangsu
PI436562    Ai jiao zao                South   Hubei
PI578499A   Shanghai liu yue bai       South   Shanghai
PI602994    Pu dong da huang dou       South   Shanghai
PI32454     Tai xing hei dou           South   Jiangsu
PI430620    Hou zi mao                 South   Hubei
PI578504    Xiang dou No. 3            South   Hunan

† Northeast region in China
‡ Huang Huai Hai region in east central China

Table 2. Diversity estimates. Overall = all the populations. CCG = central Chinese germplasm. NAC = North American cultivars. SCG = southern Chinese germplasm. C+N = combined CCG and NAC. C+S = combined CCG and SCG. N+S = combined NAC and SCG. P̂ = proportion of polymorphic loci.

Locus/probe  Overall  CCG   SE†   NAC   SE†   SCG   SE†   C+N   C+S   N+S
A023         0.50     0.08  0.03  0.47  0.03  0.03  0.02  0.29  0.49  0.33
A063-2       0.59     0.52  0.04  0.34  0.06  0.38  0.04  0.64  0.63  0.37
A085         0.51     0.56  0.02  0.49  0.01  0.44  0.03  0.55  0.50  0.48
A086         0.44     0.50  0.01  0.38  0.05  0.37  0.04  0.48  0.45  0.37
A095         0.41     0.48  0.02  0.40  0.05  0.32  0.04  0.45  0.41  0.35
A186         0.33     0.31  0.05  0.31  0.06  0.35  0.00  0.31  0.33  0.33
A333         0.51     0.51  0.02  0.04  0.03  0.33  0.04  0.42  0.47  0.49
A374         0.49     0.53  0.02  0.09  0.05  0.48  0.02  0.49  0.52  0.40
A381         0.50     0.29  0.05  0.36  0.05  0.40  0.03  0.32  0.50  0.49
A401         0.50     0.50  0.00  0.47  0.03  0.41  0.03  0.49  0.48  0.48
A461         0.37     0.45  0.03  0.47  0.03  0.18  0.04  0.46  0.33  0.31
A481         0.40     0.50  0.02  0.09  0.05  0.41  0.04  0.40  0.46  0.31
A505         0.49     0.49  0.01  0.47  0.03  0.38  0.04  0.48  0.47  0.47
A520         0.14     0.14  0.04  0.15  0.05  0.13  0.04  0.14  0.13  0.14
A567         0.49     0.47  0.02  0.30  0.06  0.14  0.04  0.42  0.43  0.44
A586         0.50     0.42  0.03  0.49  0.02  0.53  0.02  0.46  0.50  0.52
A668         0.43     0.42  0.04  0.34  0.06  0.07  0.03  0.51  0.44  0.17
A681         0.66     0.50  0.04  0.45  0.05  0.53  0.03  0.49  0.66  0.66
A691         0.52     0.48  0.02  0.54  0.04  0.52  0.02  0.50  0.52  0.54
A702         0.56     0.50  0.02  0.50  0.06  0.49  0.05  0.54  0.58  0.50
A708         0.43     0.49  0.04  0.53  0.04  0.31  0.01  0.51  0.40  0.38
A806         0.56     0.32  0.05  0.32  0.06  0.22  0.04  0.33  0.57  0.46
A816         0.52     0.42  0.04  0.24  0.06  0.63  0.02  0.36  0.57  0.56
A847         0.38     0.42  0.04  0.49  0.02  0.14  0.04  0.49  0.25  0.37
B039         0.42     0.50  0.01  0.42  0.04  0.20  0.04  0.49  0.42  0.29
B122         0.39     0.26  0.05  0.49  0.01  0.38  0.04  0.39  0.33  0.43
B164         0.50     0.17  0.05  0.48  0.02  0.13  0.04  0.35  0.49  0.37
B166         0.65     0.44  0.06  0.38  0.07  0.74  0.01  0.42  0.69  0.70
K002         0.37     0.39  0.04  0.46  0.03  0.11  0.04  0.48  0.25  0.36
K003         0.50     0.49  0.00  0.39  0.00  0.46  0.00  0.49  0.48  0.50
K007         0.35     0.33  0.05  0.60  0.03  0.12  0.04  0.48  0.22  0.36
K069-1       0.49     0.48  0.02  0.26  0.06  0.49  0.01  0.49  0.50  0.44
K070         0.56     0.33  0.05  0.12  0.05  0.40  0.05  0.26  0.61  0.56

             Overall  CCG   NAC   SCG   C+N   C+S   N+S
Mean (Ĥ)     0.47     0.41  0.37  0.34  0.44  0.46  0.42
SE‡          0.02     0.02  0.03  0.03  0.02  0.02  0.02
P̂            1.00     1.00  1.00  1.00  1.00  1.00  1.00

†: The standard error of ĥ at each locus was obtained by [V_l(ĥ)]^(1/2).
‡: The standard error of Ĥ was obtained from the total variance V(Ĥ).

Table 3. Estimates of diversities for 33 polymorphic loci for eight subgroups of central Chinese germplasm (CCG) and seven subgroups of southern Chinese germplasm (SCG). P̂ = proportion of polymorphic loci.

CCG
Locus    Anhui  Gansu  Hebei  Henan  Jiangsu  Shaanxi  Shandong  Shanxi
A023     0.00   0.00   0.00   0.22   0.00     0.12     0.44      0.07
A063     0.65   0.67   0.00   0.65   0.47     0.20     0.50      0.32
A085     0.00   0.48   0.67   0.44   0.35     0.43     0.44      0.63
A086     0.00   0.50   0.44   0.49   0.48     0.28     0.44      0.49
A095     0.24   0.49   0.00   0.50   0.18     0.47     0.00      0.48
A186     0.41   0.36   0.00   0.42   0.32     0.34     0.44      0.08
A333     0.24   0.39   0.44   0.41   0.22     0.55     0.44      0.49
A374     0.41   0.54   0.44   0.56   0.34     0.49     0.44      0.42
A381     0.41   0.00   0.00   0.46   0.48     0.10     0.44      0.04
A401     0.24   0.46   0.44   0.35   0.49     0.49     0.44      0.40
A461     0.41   0.34   0.44   0.43   0.44     0.20     0.44      0.47
A481     0.13   0.52   0.44   0.40   0.26     0.50     0.44      0.49
A505     0.00   0.41   0.44   0.49   0.49     0.39     0.00      0.47
A520     0.00   0.16   0.00   0.18   0.35     0.23     0.00      0.00
A567     0.24   0.50   0.44   0.38   0.20     0.44     0.00      0.49
A586     0.24   0.42   0.50   0.48   0.44     0.38     0.44      0.35
A668     0.49   0.45   0.00   0.22   0.38     0.29     0.44      0.45
A681     0.24   0.57   0.44   0.37   0.50     0.21     0.00      0.40
A691     0.41   0.39   0.44   0.49   0.24     0.47     0.44      0.49
A702     0.24   0.29   0.00   0.62   0.59     0.37     0.00      0.51
A708     0.00   0.34   0.00   0.46   0.32     0.00     0.00      0.50
A806     0.00   0.48   0.44   0.00   0.18     0.50     0.00      0.22
A816     0.00   0.56   0.50   0.32   0.48     0.00     0.00      0.16
A847     0.00   0.44   0.44   0.00   0.00     0.10     0.00      0.38
B039     0.00   0.49   0.44   0.42   0.48     0.48     0.00      0.50
B122     0.24   0.44   0.44   0.00   0.35     0.22     0.00      0.08
B164     0.50   0.16   0.00   0.00   0.18     0.18     0.00      0.00
B166     0.41   0.45   0.00   0.66   0.65     0.48     0.44      0.11
K002     0.00   0.30   0.44   0.48   0.44     0.19     0.00      0.36
K003     0.41   0.50   0.44   0.42   0.32     0.23     0.44      0.38
K007     0.00   0.30   0.44   0.42   0.44     0.23     0.00      0.38
K069     0.44   0.47   0.44   0.38   0.50     0.50     0.00      0.44
K070     0.00   0.47   0.00   0.40   0.18     0.43     0.00      0.14
Mean     0.22   0.40   0.29   0.38   0.36     0.32     0.22      0.34
Std.err  0.03   0.03   0.04   0.03   0.03     0.03     0.04      0.03
P̂        0.61   0.94   0.64   0.88   0.94     0.94     0.48      0.94

SCG
Locus    Anhui  Guangdong  Hubei  Jiangsu  Shanghai  Sichuan  Zhejiang
A023     0.00   0.00       0.00   0.00     0.00      0.10     0.00
A063     0.50   0.41       0.35   0.39     0.44      0.27     0.31
A085     0.32   0.24       0.48   0.37     0.49      0.48     0.24
A086     0.15   0.49       0.06   0.50     0.50      0.47     0.17
A095     0.49   0.00       0.22   0.19     0.00      0.49     0.24
A186     0.15   0.24       0.27   0.34     0.44      0.40     0.44
A333     0.15   0.24       0.36   0.21     0.28      0.50     0.24
A374     0.28   0.00       0.50   0.46     0.49      0.10     0.48
A381     0.41   0.13       0.31   0.27     0.15      0.44     0.31
A401     0.22   0.22       0.50   0.39     0.00      0.49     0.09
A461     0.28   0.00       0.22   0.10     0.00      0.10     0.24
A481     0.31   0.24       0.53   0.13     0.32      0.35     0.33
A505     0.46   0.44       0.42   0.12     0.28      0.48     0.31
A520     0.00   0.24       0.12   0.07     0.00      0.31     0.00
A567     0.15   0.00       0.12   0.07     0.28      0.30     0.00
A586     0.44   0.45       0.51   0.42     0.61      0.48     0.50
A668     0.00   0.00       0.03   0.07     0.00      0.19     0.05
A681     0.29   0.64       0.49   0.53     0.40      0.44     0.50
A691     0.22   0.28       0.51   0.35     0.28      0.47     0.45
A702     0.15   0.41       0.46   0.50     0.50      0.39     0.47
A708     0.00   0.00       0.06   0.50     0.00      0.00     0.47
A806     0.32   0.22       0.23   0.11     0.28      0.40     0.00
A816     0.54   0.24       0.41   0.30     0.50      0.62     0.31
A847     0.00   0.24       0.06   0.00     0.00      0.43     0.09
B039     0.31   0.24       0.17   0.29     0.00      0.00     0.17
B122     0.15   0.00       0.23   0.41     0.28      0.43     0.24
B164     0.32   0.41       0.00   0.14     0.28      0.10     0.00
B166     0.75   0.41       0.69   0.68     0.60      0.57     0.73
K002     0.15   0.38       0.00   0.12     0.28      0.10     0.00
K003     0.30   0.50       0.50   0.50     0.32      0.20     0.48
K007     0.08   0.22       0.27   0.07     0.00      0.00     0.00
K069     0.48   0.00       0.47   0.49     0.44      0.40     0.43
K070     0.63   0.00       0.52   0.22     0.00      0.28     0.35
Mean     0.27   0.23       0.31   0.28     0.26      0.33     0.26
Std.err  0.03   0.03       0.04   0.03     0.04      0.03     0.03
P̂        0.85   0.70       0.91   0.94     0.67      0.91     0.79

Table 4. Allele frequencies for each probe-restriction enzyme combination across all 302 lines and for central Chinese germplasm (CCG), the North American cultivars (NAC), and southern Chinese germplasm (SCG).

                                 Gene Pool
Probe  Enzyme   Allele  Overall  CCG   NAC   SCG
A023   EcoRI    A       0.46     0.96  0.62  0.02
A023   EcoRI    B       0.54     0.03  0.38  0.98
A023   EcoRI    C       0.00     0.01  0.00  0.00
A063   EcoRI    A       0.56     0.18  0.78  0.75
A063   EcoRI    B       0.22     0.18  0.22  0.25
A063   EcoRI    C       0.22     0.64  0.00  0.00
A085   EcoRI    A       0.39     0.39  0.56  0.31
A085   EcoRI    B       0.57     0.53  0.44  0.68
A085   EcoRI    C       0.03     0.08  0.00  0.01
A086   HindIII  A       0.68     0.54  0.74  0.76
A086   HindIII  B       0.32     0.46  0.26  0.24
A095   TaqI     A       0.72     0.61  0.73  0.80
A095   TaqI     B       0.28     0.39  0.26  0.20
A095   TaqI     C       0.00     0.00  0.01  0.00
A186   DraI     A       0.80     0.81  0.81  0.78
A186   DraI     B       0.19     0.17  0.19  0.21
A186   DraI     C       0.01     0.02  0.00  0.77
A333   EcoRI    A       0.47     0.56  0.98  0.20
A333   EcoRI    B       0.52     0.42  0.02  0.80
A333   EcoRI    C       0.01     0.02  0.00  0.01
A374   HindIII  A       0.61     0.41  0.95  0.62
A374   HindIII  B       0.37     0.55  0.05  0.38
A374   HindIII  C       0.02     0.04  0.00  0.01
A381   DraI     A       0.56     0.82  0.77  0.27
A381   DraI     B       0.43     0.16  0.23  0.73
A381   DraI     C       0.00     0.01  0.00  0.00
A401   TaqI     A       0.44     0.52  0.62  0.28
A401   TaqI     B       0.56     0.48  0.38  0.71
A401   TaqI     C       0.00     0.00  0.00  0.01
A461   TaqI     A       0.76     0.65  0.63  0.90
A461   TaqI     B       0.24     0.35  0.37  0.10
A481   EcoRV    A       0.74     0.60  0.95  0.74
A481   EcoRV    B       0.23     0.39  0.05  0.20
A481   EcoRV    C       0.03     0.02  0.00  0.06
A505   EcoRI    A       0.56     0.43  0.38  0.75
A505   EcoRI    B       0.44     0.57  0.62  0.25
A520   TaqI     A       0.93     0.93  0.92  0.93
A520   TaqI     B       0.07     0.07  0.08  0.06
A520   TaqI     C       0.00     0.00  0.00  0.01
A567   HindIII  A       0.42     0.63  0.82  0.08
A567   HindIII  B       0.58     0.37  0.18  0.92
A586   TaqI     A       0.59     0.70  0.58  0.52
A586   TaqI     B       0.39     0.30  0.41  0.44
A586   TaqI     C       0.02     0.00  0.00  0.04
A668   EcoRI    A       0.70     0.28  0.78  0.96
A668   EcoRI    B       0.30     0.71  0.22  0.04
A668   EcoRI    C       0.00     0.01  0.00  0.00
A681   HindIII  A       0.28     0.20  0.30  0.34
A681   HindIII  B       0.40     0.66  0.68  0.06
A681   HindIII  C       0.31     0.13  0.02  0.60
A691   TaqI     A       0.52     0.61  0.58  0.43
A691   TaqI     B       0.45     0.39  0.34  0.54
A691   TaqI     C       0.03     0.00  0.08  0.03
A702   HindIII  A       0.52     0.67  0.52  0.40
A702   HindIII  B       0.41     0.12  0.48  0.59
A702   HindIII  C       0.07     0.20  0.00  0.01
A702   HindIII  D       0.00     0.01  0.00  0.00
A708   DraI     A       0.70     0.56  0.60  0.81
A708   DraI     B       0.29     0.44  0.34  0.19
A708   DraI     C       0.01     0.00  0.04  0.00
A708   DraI     D       0.00     0.00  0.02  0.00
A806   DraI     A       0.50     0.81  0.80  0.12
A806   DraI     B       0.44     0.01  0.20  0.88
A806   DraI     C       0.06     0.18  0.00  0.01
A816   EcoRV    A       0.63     0.72  0.86  0.46
A816   EcoRV    B       0.26     0.24  0.14  0.34
A816   EcoRV    C       0.11     0.04  0.00  0.20
A847   HindIII  A       0.75     0.70  0.43  0.92
A847   HindIII  B       0.25     0.30  0.57  0.07
A847   HindIII  C       0.00     0.00  0.00  0.01
B039   EcoRI    A       0.70     0.46  0.70  0.89
B039   EcoRI    B       0.29     0.54  0.30  0.09
B039   EcoRI    C       0.01     0.00  0.00  0.02
B122   EcoRI    A       0.26     0.15  0.45  0.25
B122   EcoRI    B       0.74     0.85  0.55  0.74
B164   DraI     A       0.45     0.91  0.60  0.07
B164   DraI     B       0.54     0.08  0.40  0.93
B164   DraI     C       0.00     0.01  0.00  0.00
B166   EcoRI    A       0.53     0.73  0.77  0.27
B166   EcoRI    B       0.22     0.08  0.12  0.38
B166   EcoRI    C       0.12     0.09  0.08  0.16
B166   EcoRI    D       0.04     0.03  0.03  0.05
B166   EcoRI    E       0.07     0.01  0.00  0.15
B166   EcoRI    F       0.02     0.06  0.00  0.00
K002   HindIII  A       0.25     0.27  0.63  0.06
K002   HindIII  B       0.75     0.74  0.37  0.94
K003   EcoRV    A       0.47     0.44  0.73  0.36
K003   EcoRV    B       0.53     0.56  0.27  0.64
K007   HindIII  A       0.18     0.21  0.41  0.04
K007   HindIII  B       0.79     0.79  0.46  0.94
K007   HindIII  C       0.03     0.00  0.13  0.02
K007   HindIII  D       0.00     0.00  0.00  0.01
K069   EcoRI    A       0.58     0.39  0.84  0.58
K069   EcoRI    B       0.42     0.61  0.16  0.42
K070   TaqI     A       0.57     0.79  0.94  0.18
K070   TaqI     B       0.12     0.21  0.06  0.08
K070   TaqI     C       0.31     0.00  0.00  0.75

Table 5. Relationship of allele number among central Chinese germplasm (CCG), the North American cultivars (NAC), and southern Chinese germplasm (SCG).

                                                 Present number relative to the null number
      Number of alleles  Number of null alleles  CCG   NAC   SCG
CCG   85                 12                      ****  5     9
NAC   74                 23                      16    ****  16
SCG   87                 10                      7     3     ****

Table 6. Genetic distance matrix determined by RFLP among central Chinese germplasm (CCG), the North American cultivars (NAC), and southern Chinese germplasm (SCG).

GD (× 100)
ID    CCG   NAC   SCG
CCG   ****  88    72
NAC   13    ****  75
SCG   33    29    ****

Table 7. Origin and Maturity Group of NAC lines.

NAC lines        Origin                 Maturity Group
A1564            China X Japan X Korea  I
A2943            China X Japan X Korea  II
A3127            China X Japan          III
B216             Unknown                II
Bonus            China X Japan          IV
BSR201           China                  II
Dorman           China X Korea          V
Essex            China X Japan          V
Haberlandt       Korea                  VI
Hood             China X Japan          VI
Hutcheson        China                  V
Ogden            Tokyo X China          VI
Patoka           China                  IV
Peking           China                  IV
Pella            China X Japan          III
PI71506          China                  IV
Ransom           China X Japan X Korea  VII
Williams         China                  III
York             China X Korea          V
Kingwa           China                  IV
Adams            China                  III
PI54610-1        China                  III
Clark            China                  IV
Corsoy           China                  II
Harosoy          China                  II
williams         China                  III
Perry            Japan                  IV
Arksoy           Korea                  VI
Tokyo1           Japan                  VII
Roanoke          China                  VII
Mandarin         China                  I
Manchu           China                  III
Mandarin Ottawa  China                  I
Richland         China                  II
AK Harrow        China                  III
Mukden           China                  II
Illini           China                  III
Dunfield         China                  III
CNS              China                  VII
Manitoba Brown   Unknown                0
Tokyo2           Japan                  VII
PI54610-4        China                  IV
S100             China                  V
Pagoda           China                  0
Acme             China                  0
Lincoln          China                  III
Capital          China                  0
Hawkeye          China                  II
BlackHawk        China                  I
Evans            China                  0
McCall           China                  0
Lee              China                  VI
Century          China                  II
Chippewa         China                  I
Ford             China                  III
Shelby           China                  III
Merit            China                  0
Kent             China X Japan          IV
Adelphia         China X Japan          III
Wayne            China                  III
Amsoy            China                  II
Hark             China                  I
Hobbit           China X Japan X Korea  III
Beeson           China X Japan          II
Calland          China X Japan          III