Journal Pre-proof

To appear in: Food Chemistry
PII: S0308-8146(22)00482-4
DOI: https://doi.org/10.1016/j.foodchem.2022.132520
Reference: FOCH 132520
Received Date: 7 December 2021
Revised Date: 27 January 2022
Accepted Date: 17 February 2022

Please cite this article as: Gyapong Agyenim-Boateng, K., Zhang, S., Shariful Islam, M., Gu, Y., Li, B., Azam, M., Abdelghany, A.M., Qi, J., Ghosh, S., Shaibu, A.S., Sibhatu Gebregziabher, B., Feng, Y., Li, J., Li, Y., Zhang, C., Qiu, L., Liu, Z., Liang, Q., Sun, J., Profiling of naturally occurring folates in a diverse soybean germplasm by HPLC-MS/MS, Food Chemistry (2022), doi: https://doi.org/10.1016/j.foodchem.2022.132520

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2022 Published by Elsevier Ltd.

Profiling of naturally occurring folates in a diverse soybean germplasm by HPLC-MS/MS

Kwadwo Gyapong Agyenim-Boateng1#, Shengrui Zhang1#, Md Shariful Islam2#, Yongzhe Gu3#, Bin Li1#, Muhammad Azam1, Ahmed M. Abdelghany1,4, Jie Qi1, Suprio Ghosh1,5, Abdulwahab S.
Shaibu1,6, Berhane Sibhatu Gebregziabher1,7, Yue Feng1, Jing Li1, Yinghui Li3, Chunyi Zhang2, Lijuan Qiu3, Zhangxiong Liu3*, Qiuju Liang2*, Junming Sun1*

1 The National Engineering Laboratory for Crop Molecular Breeding, MARA Key Laboratory of Soybean Biology (Beijing), Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
2 Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
3 The National Key Facility for Crop Gene Resources and Genetic Improvement (NFCRI)/Key Laboratory of Germplasm and Biotechnology (MARA), Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
4 Crop Science Department, Faculty of Agriculture, Damanhour University, Damanhour 22516, Egypt
5 Bangladesh Agricultural Research Institute, Gazipur 1701, Bangladesh
6 Department of Agronomy, Bayero University, Kano 700001, Nigeria
7 Crop Sciences Research Department, Mehoni Agricultural Research Center, Maichew 7020, Ethiopia

#These authors contributed equally to this work.
*Corresponding authors
Email: sunjunming@caas.cn, liangquiju@caas.cn, liuzhangxiong@caas.cn
Tel./Fax: 0086-10-82105805

Abstract

Soybean is a rich source of folates. We optimised the extraction and detection of folates from soybean seeds by HPLC-MS/MS and analysed the folate content and composition of 1074 accessions. Total folate content ranged from 64.51 to 691.24 μg/100 g fresh weight, a 10-fold variation, and 60 elite accessions with over 400 μg/100 g of total folate were identified. The most abundant component was 5-CHO-H4folate, which accounted for an average of 60% of total folate. Seed-coat colour, seed weight, ecoregion, and cultivar type significantly affected soybean folate content. Furthermore, 5-CH3-H4folate correlated positively with seed protein (r = 0.24***) and negatively with oil (r = -0.26***).
The geographical distribution of folate according to accession origin revealed that accessions from Northeast China contain higher amounts of total folate and 5-CHO-H4folate. This study provides comprehensive and novel insights into the folate profile of soybean, which will benefit soybean breeding for folate enhancement.

Keywords: Soybean (Glycine max L. Merrill); Folate; HPLC-MS/MS; Germplasm; Natural variation; Elite accessions

Chemical compounds
10-Formyl-folic acid (PubChem CID: 135405023); 5,10-Methenyl-tetrahydrofolate (PubChem CID: 135398657); 5-Formyl-tetrahydrofolate (PubChem CID: 135403648); 5-Methyl-tetrahydrofolate (PubChem CID: 135483998); Dihydrofolate (PubChem CID: 135398604); Folic acid (PubChem CID: 135398658); Tetrahydrofolate (PubChem CID: 135444742); Methotrexate (PubChem CID: 126941); Sodium phosphate monobasic (PubChem CID: 23672064); Sodium phosphate dibasic (PubChem CID: 24203); Sodium ascorbate (PubChem CID: 23667548); β-Mercaptoethanol (PubChem CID: 1567); α-Amylase (PubChem CID: 62698); Acetonitrile (PubChem CID: 6342); Formic acid (PubChem CID: 284)

1. Introduction

Folates (vitamin B9) are essential water-soluble vitamins that function as co-enzymes in numerous metabolic processes by mediating one-carbon transfer reactions (Strobbe & Dominique, 2017). Humans cannot synthesise folates de novo and depend entirely on dietary sources. Folate deficiency causes severe health disorders, including neural tube defects, anaemia, cardiovascular disease, and certain cancers (Blancquaert et al., 2010). Because of these risks, folic acid fortification has been mandated in certain countries. Beyond the financial and public-education challenges of synthetic folic acid fortification, chronic intake of synthetic folic acid has been linked to adverse health effects.
These effects include increased cancer risks, hepatotoxicity and masked vitamin B12 deficiencies (Patel & Sobczyńska-Malefora, 2017). Thus, folate biofortification, which enhances the natural folate content of crops through metabolic engineering and conventional breeding, is a cost-effective, efficient and promising alternative.

Legumes are generally considered rich sources of folates with wide variation in content, making them good candidates for folate improvement. Analysis of the folate content of 29 wild and 4 cultivated lentil accessions revealed that wild and cultivated lentils contained 197.00–497.00 μg/100 g and 174.00–364.00 μg/100 g dry weight (DW) of total folate, respectively (Zhang et al., 2019). A study of an 85-accession pea germplasm panel showed that folate contents ranged from 14.00–55.00 μg/100 g DW (Jha et al., 2020). The total folate contents of 96 common bean accessions were 113.0–222.00 μg/100 g (Martin, Torkamaneh, & Pauls, 2021). Reported soybean total folate contents range from 202.90–450.00 μg/100 g (Shohag, Wei, & Yang, 2012; Rychlik, Englert, & Kirchhoff, 2007; Ginting & Arcot, 2004; Mo et al., 2013). However, few soybean cultivars were used in these studies, and the natural variation of folates in a large soybean population has not been investigated.

Soybean is a major economic crop, containing on average 40% protein, 20% oil, and 15% carbohydrates (Azam et al., 2020). Therefore, studying the folate composition of a large soybean germplasm population and selecting elite accessions will benefit soybean breeding and help combat folate deficiency. Furthermore, through conventional breeding, the natural variation of folates in a crop population can be exploited to identify QTL or genes for folate biofortification.
The objectives of this study were to (i) optimise the extraction protocol for folate monoglutamates in soybean, focusing on the extraction buffer pH, enzyme treatment and boiling time; (ii) investigate the natural variation of folate in a diverse soybean germplasm; (iii) assess the effect of seed coat colour and seed weight on folate content; and (iv) evaluate the association of folate with protein, oil content and geographical factors. The findings of this study provide essential information for folate improvement in soybean. The elite cultivars identified can be used in the food industry and as donor parents in developing soybean cultivars with improved folate content.

2. Materials and methods

2.1. Materials

For folate extraction optimisation, we used Zhonghuang 203 (ZH203), a soybean cultivar developed by the Soybean High-Yield and Quality Breeding Research Group of the Institute of Crop Sciences, Chinese Academy of Agricultural Sciences (CAAS), which was harvested in autumn 2019. Three technical replicates were used for method development. The soybean germplasm comprised 590 landraces and 382 cultivars and was collected from the three major soybean ecoregions in China, namely the Northern Region (NR, 187 accessions), Huanghuaihai Region (HR, 252 accessions) and Southern Region (SR, 343 accessions), as well as cultivars from outside China, including East Asia (19 accessions), Russia (15 accessions), North America (15 accessions), South America (9 accessions) and Europe (8 accessions). The accessions were planted in 2018 in Sanya, Hainan province (18°24′ N and 109°5′ E), China. The experimental details, climatic conditions and agronomic practices have been previously reported (Abdelghany et al., 2020). The harvested seeds were air-dried to a low moisture content (4–8%), pulverised (IKA, A10 basic, Rheinische, Germany), and stored at -20°C until folate analysis. Hundred-seed weight was measured after the seeds were dried.
Seed-coat colours were visually determined as black (102 accessions), brindled (19 accessions), brown (73 accessions), green (86 accessions), purple-red (8 accessions) and yellow (757 accessions).

2.2. Reagents

Folate monoglutamates including folic acid (PteGlu; FA), 5-methyltetrahydrofolate (5-CH3-H4folate; 5MTHF), tetrahydrofolate (H4folate; THF), 5,10-methenyltetrahydrofolate (5,10-CH=H4folate; 5,10MTHF), 5-formyltetrahydrofolate (5-CHO-H4folate; 5FTHF), 10-formylfolic acid (10-CHO-PteGlu; 10FFA), dihydrofolate (H2folate; DHF), and methotrexate (MTX) were purchased from Schirks Laboratories (Jona, Switzerland). MeFox, a pyrazino-s-triazine that is the oxidation product of 5-CH3-H4folate, was obtained from Toronto Research Chemicals (Toronto, Canada). MTX was used as the internal standard to correct errors during sample preparation and quantification. The purity of all folate standards was over 95%. Sodium phosphate monobasic (NaH2PO4), sodium phosphate dibasic (Na2HPO4), sodium ascorbate, β-mercaptoethanol, α-amylase (from Aspergillus oryzae, ∼30 units/mg), and protease (Type XIV, from Streptomyces griseus, ≥ 3.5 units/mg) were sourced from Sigma-Aldrich (St. Louis, MO, USA). Ultra-pure water was purified on a Heal Force ultra-pure water system (Shanghai, China). Acetonitrile and formic acid (HPLC-MS grade) were purchased from Fisher Scientific (Geel, Belgium). Rat serum, chicken pancreas, and β-mercaptoethanol were sourced from Beijing Solarbio Science and Technology Company (Beijing, China). A certified reference material (BCR485) was purchased from the Institute for Reference Materials and Measurements (Geel, Belgium) and was stored at -80°C.

2.3. Stock solutions and folate standards

All standard stock solutions (100 μg/mL) were prepared under subdued light to prevent photo-oxidation using methanol and 20 mM ammonium acetate (pH 6.2) (Riaz et al., 2019), containing 0.5% L-ascorbic acid and 0.5% β-mercaptoethanol.
5,10-CH=H4folate was prepared using the same buffer as above at pH 4.5. Standard stock solutions were stored in 1 mL aliquots at -80°C.

2.4. Folate extraction

Parameters including pH of extraction buffer (pH 4.5, 5.5, 6.5, 7, 7.5, 8.5, and 9), enzyme treatment and dosage (α-amylase, protease, chicken pancreas and rat serum), and boiling time (5, 10, 15 and 20 min) were optimised for the best folate recovery from soybean seeds. Sample extraction was carried out under subdued light to prevent photo-oxidation. The initial extraction protocol, used as the starting point for optimisation, was as follows. Fifty milligrams of fine powder was transferred into a 1.5 mL screw-cap tube (ST-150; Axygen, Union City, CA, USA) and was mixed with 1 mL of 50 mM phosphate buffer (containing 0.5% L-ascorbic acid and 0.2% β-mercaptoethanol as antioxidants, and 30 ng/mL MTX as internal standard) on a Miulab multi-tube vortex mixer (Miulab, Hangzhou, China). The mixture was immediately boiled for 10 min in a boiling water bath and cooled on ice for 10 min. Twenty microlitres of α-amylase was added, and the sample was incubated at 37°C with shaking for 30 min. After incubation, the sample was boiled for 5 min to deactivate the enzyme and then cooled on ice for 10 min. Thereafter, 15 μL of protease was added, and the tube was incubated for 1 h at 37°C. After incubation, the sample was boiled for 5 min to deactivate the enzymes. The deconjugation of polyglutamylated tails was carried out by adding 30 μL of rat serum to the sample and incubating for 4 h at 37°C. Subsequently, the sample was boiled for 10 min, cooled on ice for 10 min, and centrifuged twice at 13,000 rpm for 10 min at 4°C. For sample clean-up, 400 µL of the supernatant was transferred into a 3 kDa MWCO ultra-filtration tube (Millipore, Burlington, MA, USA) and centrifuged at 13,000 rpm for 20 min at 4°C. Finally, 100 μL of the filtrate was taken for analysis, and the remainder was stored at -80°C.
Extracts were kept in low-actinic HPLC vials to avoid exposure to light.

2.5. HPLC-MS/MS

For folate separation and quantification, all parameters were the same as described by Liang et al. (2020). Folate separation was performed using an Agilent 1260 HPLC system with a Kromasil 100-5 C18 column (Akzo Nobel, Sweden), an Agilent SB-C18 pre-column (Agilent Technologies, USA), and an injection volume of 15 μL, controlled using the Mass Hunter software. The injector and column oven temperatures were maintained at 4°C and 25°C, respectively. The mobile phases were 0.1% (v/v) formic acid in water (phase A) and 0.1% (v/v) formic acid in acetonitrile (phase B). The gradient program was run for 16.5 min (Supplementary Table S1). Electrospray ionisation in positive mode was performed using a triple-quadrupole tandem mass spectrometer (Agilent 6420, Palo Alto, CA, USA). Multiple reaction monitoring (MRM) parameters, including the precursor ion, product ion and collision energy, were optimised for fragmentation, and one major product ion for each folate vitamer was selected for subsequent analysis.

2.6. Method validation

For method validation, the sensitivity, linearity, precision, trueness, absolute recovery, and matrix effect (ME) were evaluated. Sensitivity and linearity parameters were determined based on an eight-point calibration curve prepared in a blank soybean matrix (n = 3). For trueness, the folate content of BCR485 was quantified in triplicate. Precision was calculated based on the relative standard deviation (RSD) of the content and peak areas of BCR485 and standards on the same and different days. Intra-day repeatability was assessed in one batch, while inter-day repeatability was evaluated by running samples on three different days (n = 6). The RSD of the retention time of the folate standards relative to the analytes was also calculated to evaluate the stability of the method.
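The precision measure described above, the relative standard deviation of replicate measurements, can be sketched as follows. This is a minimal illustration that assumes the RSD is the sample standard deviation divided by the mean, expressed as a percentage; the function name `rsd_percent` and the replicate values are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%): sample SD divided by the mean, times 100."""
    if len(values) < 2:
        raise ValueError("need at least two replicate values")
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; RSD is undefined")
    return stdev(values) / abs(m) * 100.0


# Hypothetical replicate peak areas (n = 6), for illustration only
replicates = [101.2, 98.7, 103.5, 99.9, 100.8, 97.4]
print(f"RSD: {rsd_percent(replicates):.2f}%")
```

The same function would apply to contents, peak areas or retention times, with intra-day values taken from one batch and inter-day values pooled across days.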
Absolute recovery and matrix effect were calculated as described by Matuszewski, Constanzer, & Chavez-Eng (2003). To prepare a blank soybean matrix, 2 g of seed powder of ZH203 was weighed and mixed with 40 mL of 50 mM phosphate buffer (without antioxidants). The mixture was boiled for 1 h at 100°C while being exposed to direct sunlight to degrade endogenous folates and was centrifuged at 13,000 rpm for 30 min at 4°C. Ten per cent activated charcoal was added to the supernatant, which was incubated with shaking for 1 h and centrifuged at 13,000 rpm for 30 min. The supernatant was filtered through 3 kDa MWCO ultra-filtration tubes, and the filtrate was confirmed to contain folates below the detection limits and was stored at -20°C until use.

2.7. Folate composition of soybean in a wide soybean germplasm

The folate composition of 1074 soybean accessions was determined using the final optimised protocol, and BCR485 was analysed in triplicate with every batch as quality control, with a rejection threshold of 10%. Briefly, 50 mg of fine powder in a 1.5 mL screw-cap tube was mixed with 1 mL of 50 mM phosphate buffer (pH 5.5, 0.5% ascorbic acid, 0.2% β-mercaptoethanol and 30 ng/mL MTX) using a vortex mixer and was placed in a boiling water bath for 15 min. After cooling on ice for 10 min, 200 μL each of rat serum and chicken pancreas was added to the sample, which was incubated for 4 h at 37°C. After incubation, the sample was boiled for 10 min, cooled on ice for 10 min, and centrifuged twice at 13,000 rpm for 10 min at 4°C. For sample clean-up, 400 µL of the supernatant was transferred into a 3 kDa MWCO ultra-filtration tube and centrifuged at 13,000 rpm for 20 min at 4°C. One hundred microlitres of the filtrate was taken for analysis with HPLC-MS/MS, and the remainder was stored at -80°C. Extracts were kept in low-actinic HPLC vials to avoid exposure to light. Samples were not stored for more than a week.

2.8.
Statistical analysis

Data were subjected to analysis of variance (ANOVA) using the agricolae package (https://cran.r-project.org/web/packages/agricolae), and boxplots were drawn with the ggplot2 package (https://www.rdocumentation.org/packages/ggplot2/versions/3.3.5) in R 3.4.5 (R Foundation for Statistical Computing, Vienna, Austria). Post-hoc mean separation was done using Tukey’s HSD (P < 0.05). Figures for method development were produced using GraphPad Prism version 9.00 for Windows. Accessions were grouped into accession types (cultivar and landrace), ecoregions (NR, HR, and SR), and seed morphological traits (seed coat colour and seed weight) to evaluate their effect on folate content via ANOVA. Pearson’s correlation analysis between folates from this study and quality traits (protein and oil) from the same soybean accessions in our previous study (Azam et al., 2021) was conducted and visualised using the corrplot package (https://www.rdocumentation.org/packages/corrplot/versions/0.92) in R. The geographical distribution maps of soybean seed folates were drawn using ArcGIS 10.0 (ESRI, Redlands, CA, USA, http://destktop.arcgis.com/en/arcmap/) with ordinary kriging interpolation. Folate concentrations were calculated as μg/100 g based on fresh weight (FW). The sum of individual folates, excluding MeFox, was calculated as total folate.

3. Results and Discussion

3.1. Optimisation of folate extraction from soybean seeds

To enhance the efficiency of our method, folate monoglutamate extraction from soybean seeds was optimised by considering three major factors: the pH of the extraction buffer, enzyme treatment and dosage, and boiling time.

The stability of certain folate vitamers is greatly affected by pH and is sample-dependent (De Brouwer et al., 2007). In this study, the effects of seven pH levels of the extraction buffer (4.5, 5.5, 6.5, 7, 7.5, 8.5, and 9) on folate monoglutamate recovery were evaluated.
Folate monoglutamate extraction includes enzymatic treatments that require complex and time-consuming pH adjustments, which may lead to oxidation. Therefore, we investigated the stability of folate monoglutamate vitamers from homogenisation. Total folate differed significantly (P < 0.001) among the pH treatments (Fig. 1A). Because folate vitamers differ in stability, finding a single extraction pH that is optimal for all vitamers is very difficult. The highest total folate content (144.32 μg/100 g FW) was observed at pH 5.5, followed by pH 4.5 (119.88 μg/100 g FW) and pH 6.5 (116.13 μg/100 g FW). Remarkably, the stability of the folate vitamers H4folate, 5-CH3-H4folate and 5-CHO-H4folate was highest at pH 5.5 (Fig. 1A, Supplementary Table S2). These results were consistent with a previous study in mungbean, where the highest folates were extracted at pH 4.5–5.5 (Monch & Rychlik, 2012). Therefore, an extraction buffer of pH 5.5 was used for folate extraction in this study.

Enzymes, including α-amylase, protease, and conjugases, are commonly used for folate extraction from different food matrices. In this experiment, different combinations and dosages of enzymes were evaluated (Fig. 1B, Supplementary Table S3). Folate yield increased with increasing amounts of rat serum. Using 30 μL rat serum yielded 113.58 μg/100 g total folate, whereas 100 μL of rat serum yielded 198 μg/100 g total folate. Using 100 μL of chicken pancreas alone yielded 129.58 μg/100 g of total folate. The three- and four-enzyme treatments yielded 155.80–200.95 μg/100 g of total folate but were time-consuming and may not be ideal, especially for larger sample sizes. Moreover, increasing protease volumes increases background noise and affects quantification, consistent with other studies (Zhang et al., 2005). On the other hand, the combination of higher amounts of rat serum and chicken pancreas (200 μL of each) resulted in an optimal folate yield (220.54 μg/100 g).
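To make the comparison of enzyme treatments concrete, the sketch below computes the percentage gain in total folate yield relative to the 30 μL rat serum treatment, using only the yields quoted in the text above. The choice of that treatment as the baseline and the helper name `relative_gain` are assumptions made for illustration, not part of the authors' analysis.

```python
# Total folate yields (μg/100 g) quoted in the text for selected treatments
yields = {
    "rat serum 30 μL": 113.58,
    "rat serum 100 μL": 198.0,
    "chicken pancreas 100 μL": 129.58,
    "rat serum 200 μL + chicken pancreas 200 μL": 220.54,
}


def relative_gain(value, baseline):
    """Percentage change of `value` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (value - baseline) / baseline * 100.0


# Assumption for illustration: the 30 μL rat serum treatment is the baseline
baseline = yields["rat serum 30 μL"]
gains = {name: relative_gain(v, baseline) for name, v in yields.items()}

for name, gain in gains.items():
    print(f"{name}: {gain:+.1f}% vs. baseline")
```

On these figures, the combined 200 μL treatment roughly doubles the yield of the baseline treatment, which is in line with the choice of protocol reported in the text.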
Consistently, higher doses of conjugases have been reported to improve deconjugation and folate extraction efficiency from black bean (Ramos-Parra, Urrea-López, & de la Garza, 2013). The current study revealed that a two-step extraction process involving heating and deconjugation was sufficient for the optimal release of folates. This was consistent with a previous study in chickpea, where higher amounts of chicken pancreas and rat serum resulted in similar total folate amounts (407 μg/100 g) as compared to the combination of four enzymes (422 μg/100 g) (Zhang et al., 2018). Therefore, for further experiments in our study, the two-step extraction protocol including boiling and the combined treatment with 200 μL of rat serum and 200 μL chicken pancreas was used. Heating samples at 100°C aids in cell lysis and inactivates endogenous enzymes, inducing greater folate release from the matrix and preventing further folate conversion (Zhang et al., 2005). The effect of boiling at 100°C for 5, 10, 15, and 20 min was investigated in this study, and the total folate contents observed were 201.06, 224.19, 290 and 275.49 μg/100 g, respectively (Fig. 1C, Supplementary Table S4). As shown in Fig.1C, total folate content was the highest at 15 min. However, folate content decreased at 20 min, which may have resulted from folate degradation induced by long-time heating. This observation follows earlier reports that boiling at 100°C for 15 min and 12 min was optimal for folate recovery from food matrices and spinach, respectively (Czarnowska-Kujawska, Gujska, & Michalak, 2017; Shohag et al., 2017). Hence, boiling at 100°C for 15 min was adopted for subsequent experiments in this study. 3.2. Method validation The LOD, LOQ, and linearity were determined for each folate standard and MTX from a multi-point calibration curve prepared in a blank soybean matrix in triplicate. 
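The linearity check on a calibration curve can be sketched with an ordinary least-squares fit and its coefficient of determination. This is a minimal sketch under stated assumptions: the function name `fit_calibration`, the eight concentration levels and the response ratios (analyte peak area divided by internal-standard peak area) are all hypothetical, and the study's actual regression settings (for example, any weighting) are not given in the text.

```python
def fit_calibration(conc, response):
    """Ordinary least-squares line; returns (slope, intercept, r_squared)."""
    if len(conc) != len(response) or len(conc) < 3:
        raise ValueError("need at least three paired calibration points")
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(response) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, response))
    ss_tot = sum((y - mean_y) ** 2 for y in response)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("concentrations or responses do not vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares of the fitted line
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(conc, response))
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical eight-point curve: concentration (ng/mL) vs. analyte/IS area ratio
conc = [0.5, 1, 2, 5, 10, 20, 50, 100]
ratio = [0.021, 0.043, 0.079, 0.205, 0.398, 0.810, 1.990, 4.020]
slope, intercept, r2 = fit_calibration(conc, ratio)
print(f"slope={slope:.4f}, intercept={intercept:.4f}, R2={r2:.4f}")
```

A curve would pass the linearity criterion used in the text when the returned R2 exceeds 0.99.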
The correlation coefficients (R²) for all folates were > 0.99, indicating good linearity of the mass-spectrometric response within the concentration ranges (Table 1). The MRM transitions of the seven folate vitamers, MeFox and the internal standard MTX are shown in Supplementary Fig. S1. For matrix effects, folate standards and MTX (10 ng/mL) were added to the extraction buffer and blank soybean matrix. Matrix effects ranged from 81.86–104.87% (Supplementary Table S5). The intra-day and inter-day precision of all folates were within the acceptable ranges of 3.81–7.01% and 6.26–11.09% RSD, respectively (Supplementary Table S6). RSDs of retention times ranged from 0.06–0.33%, indicating run-to-run precision and robustness of the method. Similarly, peak shapes did not differ between the analytes and the standards, indicating that no compounds co-eluted with the target analytes.

The absolute recoveries of each folate and MTX are listed in Supplementary Table S5. The major folate vitamers showed acceptable absolute recoveries (70.26–126.06%). The low recoveries of H2folate (11.43–19.90%) and 5,10-CH=H4folate (44.22–48.51%) were caused by their instability under the heat and pH conditions used. H2folate is labile under heat, and its recovery is always low during folate extraction. Despite this, our method provided a relatively higher recovery for this minor component than the absolute recovery of H2folate in lentils (4.00–13.00%) (Zhang et al., 2019). The recovery of 5,10-CH=H4folate found in this study may indicate conversion of 5,10-CH=H4folate to 5-CHO-H4folate. Moreover, the lower recoveries of these two vitamers do not significantly affect the major components and total folate content because they are minor components accounting for less than 3% of total folate in soybean. Nevertheless, pH is highly significant in folate quantification and must be critically evaluated. Absolute recovery of H4folate ranged between 35.27–38.48%.
The low absolute recovery of this folate is due to its high lability to pH and heating conditions (Strandler et al., 2015). However, the changes of H4folate in a sample matrix may be different from a standard solution, as observed in our pH and boiling time experiments (Supplementary Table S2 and S4). This indicates that the combined use of thiols and ascorbic acid may reduce H4folate degradation in a soybean matrix. Moreover, a similar pH value of 6 has been reported to improve H4folate stability in plant and food matrices (Zhang et al., 2005; Loznjak et al., 2019). Precision and all other validation parameters for H4folate were within the acceptable ranges.

For trueness, the certified reference material, BCR485, was analysed and compared to its certified value and results from other studies (Supplementary Table S7). In our study, the total folate content of BCR485 (350.08 μg/100 g) was slightly higher than the certified total folate content (315±28 μg/100 g) determined by the microbiological assay (MA) (Finglas et al., 1998). Recent studies using HPLC-MS/MS have also reported higher total folate contents (336–375 μg/100 g) than the certified value (Vishnumohan, Arcot, & Pickford, 2011; Ringling & Rychlik, 2017), which is similar to our results. In this study, 5-CH3-H4folate was higher than the indicative value (non-certified value) and close to reported values using HPLC-MS/MS. However, the proportion of 5-CH3-H4folate (~90%) in total folate was consistent with previous studies. The discrepancies between MA and HPLC methods and the differences in sample pre-treatment methods may be responsible for the varying results in the reference material. Additionally, the contents of other folate vitamers, H4folate, 5-CHO-H4folate, 10-CHO-PteGlu, H2folate, PteGlu, 5,10-CH=H4folate, and MeFox, were also determined in this study.

3.3.
Folate profiling and vitamer distribution among 1074 soybean accessions

The final optimised folate method was applied to determine the folate composition of 1074 diverse soybean accessions. Total folate levels ranged from 64.51–691.24 μg/100 g, with an average of 262.01 μg/100 g (Supplementary Table S8). The wide natural variation in folates in this soybean panel can be utilised to map QTL and identify candidate genes for folate accumulation. The average total folate in this study falls within the folate contents previously reported in soybean (188–450 μg/100 g) (Shohag et al., 2012; Mo et al., 2013). Notably, greater diversity was found here, and 60 folate-rich soybean accessions with > 400 μg/100 g were identified, of which four accessions (ZDD14672, ZDD12910, ZDD12830 and ZDD14683) contained > 600 μg/100 g of total folate (Supplementary Table S9). The soybean accessions rich in folates could be used as genetic material for folate breeding programs or serve as good dietary sources. Compared with other crops, the highest folate levels obtained in this study show that total folate contents of soybean are several-fold higher than those of potato, wheat and pea (Riaz et al., 2019; Jha et al., 2015; Jha et al., 2020). Thus, a 100 g serving of soybeans could provide a significant amount of the recommended daily allowance of folates.

Seven folate monoglutamates were identified in soybean seeds: 5-CHO-H4folate, PteGlu, 5-CH3-H4folate, H4folate, 10-CHO-PteGlu, 5,10-CH=H4folate, and H2folate, with the first five contributing 95–98% of the total folate content (Supplementary Table S8). The most abundant vitamer, 5-CHO-H4folate, contributed 60% of total folate. PteGlu, 5-CH3-H4folate and H4folate accounted for an average of 12%, 10% and 7.6% of total folate, respectively. Meanwhile, 10-CHO-PteGlu accounted for about 6% of total folate, while 5,10-CH=H4folate and H2folate were the least abundant folate vitamers, collectively contributing about 3% of total folate.
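The total-folate definition and the vitamer percentages above can be expressed as a short calculation: total folate is the sum of the seven monoglutamates, with MeFox excluded, and each share is that vitamer's content divided by the total. The sketch below is illustrative only; the function names and the example profile are hypothetical values, not measurements from any accession in this study.

```python
# The seven folate monoglutamates summed as total folate (MeFox is excluded)
FOLATE_VITAMERS = (
    "5-CHO-H4folate", "PteGlu", "5-CH3-H4folate", "H4folate",
    "10-CHO-PteGlu", "5,10-CH=H4folate", "H2folate",
)


def total_folate(profile):
    """Sum of the seven vitamers (μg/100 g); other keys such as MeFox are ignored."""
    return sum(profile.get(v, 0.0) for v in FOLATE_VITAMERS)


def vitamer_shares(profile):
    """Percentage contribution of each vitamer to total folate."""
    total = total_folate(profile)
    if total <= 0:
        raise ValueError("total folate must be positive")
    return {v: profile.get(v, 0.0) / total * 100.0 for v in FOLATE_VITAMERS}


# Hypothetical accession profile (μg/100 g FW), for illustration only
example = {
    "5-CHO-H4folate": 157.2, "PteGlu": 31.4, "5-CH3-H4folate": 26.2,
    "H4folate": 19.9, "10-CHO-PteGlu": 15.7, "5,10-CH=H4folate": 5.2,
    "H2folate": 2.6, "MeFox": 400.0,
}
print(f"total folate: {total_folate(example):.1f} μg/100 g")
for vitamer, share in vitamer_shares(example).items():
    print(f"{vitamer}: {share:.1f}%")
```

Keeping the vitamer list explicit means MeFox can sit in the same record without inflating the total.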
MeFox, the oxidative product of 5-CH3-H4folate, was also identified in this study, ranging from 110.00–1601.71 μg/100 g in soybean seeds.

Studies on folate vitamer distribution in soybean are scarce. Whereas some studies reported 5-CHO-H4folate as the dominant folate vitamer (Ginting & Arcot, 2004; Shin et al., 1975), others have reported H4folate as the dominant one (Rychlik et al., 2007; Shohag et al., 2012). The discrepancies in folate vitamer distributions may be caused by storage, cultivar type, environment, and analytical method. Because H4folate is one of the most labile vitamers, it might be expected to oxidise to PteGlu at low pH. However, this was not observed in the current study. Moreover, the analytical method used in this study had high specificity, which enabled us to quantify seven folate monoglutamates and MeFox in soybean. Furthermore, 5-CHO-H4folate was determined to be the dominant vitamer. Therefore, subsequent studies on the effects of location and storage will help clarify soybean folate vitamer distribution.

3.4. Soybean folate content varies among accession types and ecoregions

To evaluate the variation of folates among accession types, the accessions were grouped into landraces (590 accessions) and improved cultivars (382 accessions). Total folate contents varied significantly (P < 0.05) between the accession types, with landraces having a wider range (64.51–691.24 μg/100 g) and a higher mean (268.99 μg/100 g) than the improved cultivars (77.54–515.93 μg/100 g, with a mean of 253.60 μg/100 g) (Fig. 2). This suggests that past breeding efforts have not focused on folate improvement in soybean. Consistent with our results, cultivated lentils had 174–361 μg/100 g of total folate compared to undomesticated lentils (195–497 μg/100 g) (Zhang et al., 2019).
Past breeding efforts geared towards enhancing crop yield and appearance may have contributed to a decline in nutritional values and genetic diversity of modern cultivars. A recent study of the Glycine spp. pangenome revealed a significant reduction of the average number of protein-coding genes per individual during domestication and selection (Bayer et al., 2021). Soybean landraces, representing a tremendous genetic diversity, are adapted to various environments and climatic conditions, exhibit tolerance to biotic stresses and harbour genes and gene complexes for quality traits. Previous studies from our research group have also shown that soybean seed quality traits, including isoflavones, fatty acids, and tocopherols, are influenced by accession type (Azam et al., 2020; Abdelghany et al., 2020; Ghosh et al., 2021). Landraces are valuable germplasm to broaden the genetic base of modern soybean cultivars. Thus, the introgression of genes from soybean landraces in breeding programs will help enhance the nutritional value of soybean.

Total folate (P < 0.05) and individual vitamers (P < 0.001) differed significantly among the ecoregions (NR, HR and SR), except 10-CHO-PteGlu (P > 0.05) (Fig. 3). Total folate was highest in NR (277.00 μg/100 g), followed by SR (267.85 μg/100 g) and HR (255.54 μg/100 g). The concentration of 5-CHO-H4folate was highest in the NR (181.18 μg/100 g). As shown earlier, 5-CHO-H4folate accounted for about 60% of total folate in this study, which explains why total folate was the highest in NR. Among cultivars, the eight accessions with the highest folates were from NR, whereas among the landraces with the highest folates, 70% were from the SR ecoregion. This showed a tendency for ecoregion to affect folate levels, indicating a genetic component of variation.

3.5. Association of folates with protein and oil

The analysis of the correlation between quality traits is essential in breeding and selection programs.
Therefore, we analysed the correlation between folates, protein and oil contents (Fig. 4). Generally, significant positive correlations existed among the folates. The highest correlation was observed between 5-CHO-H4folate and total folate (r = 0.90***), whereas the lowest positive correlation was between H4folate and 10-CHO-PteGlu (r = 0.07*). The high correlation of 5-CHO-H4folate with total folate may be due to the high abundance of this vitamer in soybean. However, 5-CHO-H4folate and 5,10-CH=H4folate had a low negative correlation (r = -0.11***), which may have been caused by the interconversion between these two vitamers. The folate vitamer 5-CH3-H4folate was positively correlated with protein (r = 0.24***) but negatively correlated with oil (r = -0.26***). Soybean seed protein is known to have a significant negative correlation with oil, so the opposite signs of these two correlations are consistent with each other. Folates are protected from photo-oxidation by binding to proteins (folate binding proteins; FBPs) (Liang et al., 2013), and the photo-oxidation of folates induces protein damage (Wusigale et al., 2021). In Arabidopsis, high folate accumulation was associated with the up-regulation of FBPs (Puthusseri et al., 2018), and the transgenic expression of bovine FBP increased folate content in rice (Blancquaert et al., 2015). Folates are also involved in many key metabolic functions, including amino acid synthesis, which may largely explain the positive correlation with protein identified in this study. However, further studies will be necessary to confirm this relationship in soybean.

3.6. Effects of seed morphology and agronomic traits on folate content

Understanding the relationship between seed morphological traits and nutritional content will help in seed selection and soybean genetic improvement. However, the effect of these traits on soybean folates is unknown.
A previous study in pulses reported that seed morphological traits did not significantly correlate with folate content (Jha et al., 2015). In this study, we evaluated the effect of seed weight and seed-coat colour on soybean folate monoglutamates. We observed highly significant differences for all folates due to variations in seed-coat colour. The black-seeded soybean contained the highest amounts of total folate, 5-CHO-H4folate and other folate vitamers (Supplementary Fig. S2). Studies have shown that the black-seeded soybean contains higher levels of anthocyanins and valuable nutrients with great antioxidant and anticarcinogenic properties (Slavin et al., 2009). Soybean seed-coat colour is controlled by multiple loci, most of which are involved in flavonoid-based pigmentation pathways. It has been reported that the chalcone synthase gene, which catalyses the first step of the flavone and flavonol branch of phenylpropanoid biosynthesis, was up-regulated more than two-fold in transgenic high-folate tomato (Waller et al., 2010). This suggests a possible crosstalk between folates and the flavonoid pathway. The current study is the first to examine the association of folates with seed-coat colour, and thus further studies are needed to confirm this in soybean.

To study the effect of seed weight on folates, the accessions in this study were grouped into five categories based on 100-seed weight, namely: A (< 10 g), B (≥ 10 g and < 15 g), C (≥ 15 g and < 20 g), D (≥ 20 g and < 25 g), and E (≥ 25 g). Total folate and most individual vitamer levels were significantly (P < 0.001) affected by seed weight (Supplementary Fig. S3). Accessions with seed weight in category A had the highest amounts of total folate (average, 304.03 μg/100 g) and most folate vitamers, indicating that the smaller the seed size, the higher the folate content. In pea, the folate content was higher in the embryo than in the cotyledon (Jabrin et al., 2003).
Smaller soybean seeds contain a higher proportion of embryo, which may contribute to their higher folate concentrations. However, little is known about the detailed distribution of folates within the soybean seed, and this needs further study.

3.7. Geographical effects and distribution of folates in soybean in China

The geographical distribution of soybean folates, based on their ecoregion, is shown in Fig. 5. The correlation between the geographical factors and soybean folates revealed a significant association (Supplementary Table S10). Significant positive correlations were observed between longitude and MeFox, 10-CHO-PteGlu, 5-CHO-H4folate and total folate. On the other hand, longitude had significant negative correlations with other vitamers. Latitude correlated negatively with H4folate, 5-CH3-H4folate, H2folate, and PteGlu, while MeFox, 10-CHO-PteGlu, 5-CHO-H4folate, and 5,10-CH=H4folate correlated positively with latitude. Altitude had a positive correlation with H4folate and a negative correlation with 5,10-CH=H4folate, MeFox and 10-CHO-H4folate. The geographical map revealed a distinction between the distribution of total folate, 5-CHO-H4folate, and other folate vitamers in China. The highest contents of total folate and 5-CHO-H4folate were concentrated in the NR part of China. Conversely, the highest amounts of 5-CH3-H4folate were widely distributed across the three ecoregions, with more intensity in the HR followed by the SR, and the lowest contents observed in the NR.

4. Conclusion

In conclusion, we conducted a novel study investigating the folate monoglutamate composition of 1074 diverse soybean accessions using HPLC-MS/MS. First, we optimised the extraction method and found optimal recoveries for the most important folate vitamers at pH 5.5, with boiling for 15 min and a combined use of 200 μL each of rat serum and chicken pancreas.
The extraction method was time-saving and cost-effective compared to the traditional tri-enzyme treatment methods and reduced the complications associated with longer extraction times. By profiling the 1074 soybean accessions, we identified a wide variation in folate contents, with an over 10-fold variation for total folate content, and found elite accessions with total folate contents > 400 μg/100 g. Furthermore, we observed that soybean folate monoglutamates are affected by a range of factors, including ecoregion, accession type, seed-coat colour, seed size, and geographical factors. Finally, correlation analysis revealed that folate had positive and negative relationships with protein and oil, respectively. Overall, this study provides a basis for understanding folates in soybean and shows that soybean is a strong candidate for folate biofortification.

Supplementary Data

Supplementary Fig. S1. MRM transition of seven folate vitamers, MeFox and internal standard (MTX).
Supplementary Fig. S2. Folate distribution (μg/100 g FW) among different seed coat colours in soybean.
Supplementary Fig. S3. Folate distribution (μg/100 g FW) among different seed weights in soybean.
Supplementary Table S1. Gradient program for folate analysis on HPLC MS/MS in soybean seeds.
Supplementary Table S2. Folates extracted (μg/100 g FW) after using different pH conditions in soybean seeds.
Supplementary Table S3. Folates extracted (μg/100 g FW) after using different enzyme treatments and amounts in soybean seeds.
Supplementary Table S4. Folates extracted (μg/100 g FW) using different boiling times in soybean seeds.
Supplementary Table S5. Matrix effect and absolute recovery of folate vitamers.
Supplementary Table S6. Precision measurements of folates standards and BCR485.
Supplementary Table S7. Comparison of the folate values (μg/100 g) of BCR485 detected in our study with previous studies and certified value.
Supplementary Table S8.
Descriptive statistics of 7 folate monoglutamates, total folate and MeFox among soybean accessions from Hainan 2018 extracted using our optimised extraction protocol.
Supplementary Table S9. List of soybean accessions containing > 400 μg/100 g FW of total folate.
Supplementary Table S10. Correlations between geographical factors and soybean folates.

Author contributions

K.G.A-B, S.Z, S.I., R.G. and B.L. - Formal analysis, Investigation, Methodology, Software, Writing - original draft, review & editing, Data curation. M.A., A.M.A., A.S., J.Q., S.G, B.S.G., Y.F., J.L., Y.L - Investigation, Methodology. C.Z., L.Q. - Formal analysis, Resources, Writing - review & editing. J.S., Q.L. and Z.L. - Conceptualization, Funding acquisition, Project administration, Supervision, Resources, Writing - review & editing.

Conflict of interest

The authors declare no conflict of interest.

Acknowledgements

This work was supported by the Ministry of Science and Technology (2021YFD1201605), National Natural Science Foundation of China (32161143033 and 32001574), and CAAS (Chinese Academy of Agricultural Sciences) Agricultural Science and Technology Innovation Project (2060302-2, CAAS-ZDRW202004 and SWJSZD2020-001). We thank the public laboratory of the Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, for providing us with access to the HPLC and triple-quadrupole MS/MS instruments and for providing technical assistance. Sincere gratitude goes to Dr Benjamin Karikari for his assistance in proofreading this manuscript and to all who contributed to making this manuscript fit for publication.

References

Abdelghany, A. M., Zhang, S., Azam, M., Shaibu, A. S., Feng, Y., Li, Y., … Sun, J. (2020). Profiling of seed fatty acid composition in 1025 Chinese soybean accessions from diverse ecoregions. The Crop Journal, 8(4), 635–644. https://doi.org/10.1016/j.cj.2019.11.002.
Azam, M., Zhang, S., Abdelghany, A. M., Shaibu, A. S., Feng, Y., Li, Y., … Sun, J. (2020).
Seed isoflavone profiling of 1168 soybean accessions from major growing ecoregions in China. Food Research International, 130, 108957. https://doi.org/10.1016/j.foodres.2019.108957.
Azam, M., Zhang, S., Qi, J., Abdelghany, A. M., Shaibu, A. S., Ghosh, S., … Sun, J. (2021). Profiling and associations of seed nutritional characteristics in Chinese and USA soybean cultivars. Journal of Food Composition and Analysis, 98, 103803. https://doi.org/10.1016/j.jfca.2021.103803.
Bayer, P. E., Valliyodan, B., Hu, H., Marsh, J. I., Yuan, Y., Vuong, T. D., … Nguyen, H. T. (2021). Sequencing the USDA core soybean collection reveals gene loss during domestication and breeding. The Plant Genome, e20109. https://doi.org/10.1002/tpg2.20109.
Blancquaert, D., Storozhenko, S., Loizeau, K., De Steur, H., De Brouwer, V., Viaene, J., … Van Der Straeten, D. (2010). Folates and folic acid: from fundamental research toward sustainable health. Critical Reviews in Plant Science, 29(1), 14–35. https://doi.org/10.1080/07352680903436283.
Blancquaert, D., Van Daele, J., Strobbe, S., Kiekens, F., Storozhenko, S., De Steur, H., … Van Der Straeten, D. (2015). Improving folate (vitamin B9) stability in biofortified rice through metabolic engineering. Nature Biotechnology, 33(10), 1076–1078. https://doi.org/10.1038/nbt.3358.
Czarnowska-Kujawska, M., Gujska, E., & Michalak, J. (2017). Testing of different extraction procedures for folate HPLC determination in fresh fruits and vegetables. Journal of Food Composition and Analysis, 57, 64–72. https://doi.org/10.1016/j.jfca.2016.12.019.
De Brouwer, V., Zhang, G. F., Storozhenko, S., Van Der Straeten, D., & Lambert, W. E. (2007). pH stability of individual folates during critical sample preparation steps in prevision of the analysis of plant folates. Phytochemical Analysis, 18(6), 496–508. https://doi.org/10.1002/pca.1006.
Finglas, P. M., Scott, K. J., Witthöft, C. M., van den Berg, H., & de Froidmont-Görtz, I. (1998).
The certification of the mass fractions of vitamins in four reference materials: wholemeal flour (CRM 121), milk powder (CRM 421), lyophilised mixed vegetables (CRM 485), and lyophilised pigs liver (CRM 487). EUR Report 18320. Luxembourg: Office for Official Publications of the European Communities.
Finglas, P. M., Wigertz, K., Vahteristo, L., Witthöft, C., & de Froidmont-Görtz, I. (1999). Standardisation of HPLC techniques for the determination of naturally-occurring folates in food. Food Chemistry, 64(2), 245–255. https://doi.org/10.1016/S0308-8146(98)00171-X.
Ghosh, S., Zhang, S., Azam, M., Qi, J., Abdelghany, A. M., Shaibu, A. S., … Sun, J. (2021). Seed tocopherol assessment and geographical distribution of 1151 Chinese soybean accessions from diverse ecoregions. Journal of Food Composition and Analysis, 100, 103932. https://doi.org/10.1016/j.jfca.2021.103932.
Ginting, E., & Arcot, J. (2004). High-performance liquid chromatographic determination of naturally occurring folates during tempe preparation. Journal of Agricultural and Food Chemistry, 52(26), 7752–7758. https://doi.org/10.1021/jf040198x.
Jabrin, S., Ravanel, S., Gambonnet, B., Douce, R., & Rébeillé, F. (2003). One-carbon metabolism in plants. Regulation of tetrahydrofolate synthesis during germination and seedling development. Plant Physiology, 131(3), 1431–1439. https://doi.org/10.1104/pp.016915.
Jha, A. B., Ashokkumar, K., Diapari, M., Ambrose, S. J., Zhang, H., Tar’an, B., … Purves, R. W. (2015). Genetic diversity of folate profiles in seeds of common bean, lentil, chickpea and pea. Journal of Food Composition and Analysis, 42, 134–140. https://doi.org/10.1016/j.jfca.2015.03.006.
Jha, A. B., Gali, K. K., Zhang, H., Purves, R. W., Tar’an, B., Vandenberg, A., & Warkentin, T. D. (2020). Folate profile diversity and associated SNPs using genome wide association study in pea. Euphytica, 216(2), 18. https://doi.org/10.1007/s10681-020-2553-8.
Liang, L., Jie, Z., Zhou, P., & Subirade, M. (2013).
Protective effect of ligand-binding proteins against folic acid loss due to photodecomposition. Food Chemistry, 141(2), 754–761. https://doi.org/10.1016/j.foodchem.2013.03.044.
Liang, Q., Wang, K., Shariful, I., Ye, X., & Zhang, C. (2020). Folate content and retention in wheat grains and wheat-based foods: Effects of storage, processing, and cooking methods. Food Chemistry, 333, 127459. https://doi.org/10.1016/j.foodchem.2020.127459.
Ložnjak, P., García-Salinas, C., de la Garza, R. I. D., Bysted, A., & Jakobsen, J. (2019). The use of a plant enzyme for rapid and sensitive analysis of naturally-occurring folates in food by liquid chromatography-tandem mass spectrometry. Journal of Chromatography A, 1594, 34–44. https://doi.org/10.1016/j.chroma.2019.02.037.
Martin, C. J., Torkamaneh, D., Arif, M., & Pauls, K. P. (2021). Genome-wide association study of seed folate content in common bean. Frontiers in Plant Science, 12, 696423. https://doi.org/10.3389/fpls.2021.696423.
Matuszewski, B., Constanzer, M., & Chavez-Eng, C. (2003). Strategies for the assessment of matrix effect in quantitative bioanalytical methods based on HPLC−MS/MS. Analytical Chemistry, 75(13), 3019–3030. https://doi.org/10.1021/ac020361s.
Mo, H., Kariluoto, S., Piironen, V., Zhu, Y., Sanders, M. G., Vincken, J. P., … Nout, M. R. (2013). Effect of soybean processing on content and bioaccessibility of folate, vitamin B12 and isoflavones in tofu and tempe. Food Chemistry, 141(3), 2418–2425. https://doi.org/10.1016/j.foodchem.2013.05.017.
Monch, S., & Rychlik, M. (2012). Improved folate extraction and tracing deconjugation efficiency by dual label isotope dilution assays in foods. Journal of Agricultural and Food Chemistry, 60(6), 1363–1372. https://doi.org/10.1021/jf203670g.
Patel, K. R., & Sobczyńska-Malefora, A. (2017). The adverse effects of an excessive folic acid intake. European Journal of Clinical Nutrition, 71(2), 159–163. https://doi.org/10.1038/ejcn.2016.194.
Puthusseri, B., Divya, P., Veeresh, L., Kumar, G., & Neelwarne, B. (2018). Evaluation of folate-binding proteins and stability of folates in plant foliages. Food Chemistry, 242, 555–559. https://doi.org/10.1016/j.foodchem.2017.09.049.
Ramos-Parra, P. A., Urrea-López, R., & de la Garza, R. I. D. (2013). Folate analysis in complex food matrices: Use of a recombinant Arabidopsis γ-glutamyl hydrolase for folate deglutamylation. Food Research International, 54(1), 177–185. https://doi.org/10.1016/j.foodres.2013.06.026.
Riaz, B., Liang, Q., Wan, X., Wang, K., Zhang, C., & Ye, X. (2019). Folate content analysis of wheat cultivars developed in the North China Plain. Food Chemistry, 289, 377–383. https://doi.org/10.1016/j.foodchem.2019.03.028.
Ringling, C., & Rychlik, M. (2017). Origins of the difference between food folate analysis results obtained by LC–MS/MS and microbiological assays. Analytical and Bioanalytical Chemistry, 409(7), 1815–1825. https://doi.org/10.1007/s00216-016-0126-4.
Rychlik, M., Englert, K., Kapfer, S., & Kirchhoff, E. (2007). Folate contents of legumes determined by optimized enzyme treatment and stable isotope dilution assays. Journal of Food Composition and Analysis, 20(5), 411–419. https://doi.org/10.1016/j.jfca.2006.10.006.
Shin, Y. S., Kim, E. S., Watson, J. E., & Stokstad, E. L. R. (1975). Studies of Folic Acid Compounds in Nature. IV. Folic Acid Compounds in Soybeans and Cow Milk. Canadian Journal of Biochemistry, 53(3), 338–343. https://doi.org/10.1139/o75-047.
Shohag, M., Wei, Y., & Yang, X. (2012). Changes of folate and other potential health-promoting phytochemicals in legume seeds as affected by germination. Journal of Agricultural and Food Chemistry, 60(36), 9137–9143. https://doi.org/10.1021/jf302403t.
Shohag, M., Yang, Q., Wei, Y., Zhang, J., Khan, F. Z., Rychlik, M., … Yang, X. (2017).
A rapid method for sensitive profiling of folates from plant leaf by ultra-performance liquid chromatography coupled to tandem quadrupole mass spectrometer. Journal of Chromatography B, 1040, 169–179. https://doi.org/10.1016/j.jchromb.2016.11.033.
Slavin, M., Kenworthy, W., & Yu, L. (2009). Antioxidant properties, phytochemical composition, and antiproliferative activity of Maryland-grown soybeans with colored seed coats. Journal of Agricultural and Food Chemistry, 57, 11174–11185. https://doi.org/10.1021/jf902609n.
Strandler, H. S., Patring, J., Jägerstad, M., & Jastrebova, J. (2015). Challenges in the Determination of Unsubstituted Food Folates: Impact of Stabilities and Conversions on Analytical Results. Journal of Agricultural and Food Chemistry, 63(9), 2367–2377. https://doi.org/10.1021/jf504987n.
Strobbe, S., & Van Der Straeten, D. (2017). Folate biofortification in food crops. Current Opinion in Biotechnology, 44, 202–211. https://doi.org/10.1016/j.copbio.2016.12.003.
Vishnumohan, S., Arcot, J., & Pickford, R. (2011). Naturally-occurring folates in foods: method development and analysis using liquid chromatography–tandem mass spectrometry (LC–MS/MS). Food Chemistry, 125(2), 736–742. https://doi.org/10.1016/j.foodchem.2010.08.032.
Waller, J. C., Akhtar, T. A., Lara-Núñez, A., Gregory III, J. F., McQuinn, R. P., Giovannoni, J. J., & Hanson, A. D. (2010). Developmental and feedforward control of the expression of folate biosynthesis genes in tomato fruit. Molecular Plant, 3(1), 66–77. https://doi.org/10.1093/mp/ssp057.
Wusigale, Fu, X., Yin, X., Ji, C., Cheng, H., & Liang, L. (2021). Effects of folic acid and caffeic acid on indirect photo-oxidation of proteins and their costabilization under irradiation. Journal of Agricultural and Food Chemistry, 69(42), 12505–12516. https://doi.org/10.1021/acs.jafc.1c02209.
Zhang, G. F., Storozhenko, S., Van Der Straeten, D., & Lambert, W. E. (2005).
Investigation of the extraction behavior of the main monoglutamate folates from spinach by liquid chromatography–electrospray ionization tandem mass spectrometry. Journal of Chromatography A, 1078(1–2), 59–66. https://doi.org/10.1016/j.chroma.2005.04.085.
Zhang, H., Jha, A. B., De Silva, D., Purves, R. W., Warkentin, T. D., & Vandenberg, A. (2019). Improved folate monoglutamate extraction and application to folate quantification from wild lentil seeds by ultra-performance liquid chromatography-selective reaction monitoring mass spectrometry. Journal of Chromatography B, 1121, 39–47. https://doi.org/10.1016/j.jchromb.2019.05.007.
Zhang, H., Jha, A. B., Warkentin, T. D., Vandenberg, A., & Purves, R. W. (2018). Folate stability and method optimization for folate extraction from seeds of pulse crops using LC-SRM MS. Journal of Food Composition and Analysis, 71, 44–55. https://doi.org/10.1016/j.jfca.2018.04.008.

Figure Legends

Fig. 1. Extraction optimisation of folate monoglutamates from ZH203. A. Stability of soybean folate monoglutamates under seven different pH treatments. The pH test was conducted using a tri-enzyme treatment consisting of 20 μL α-amylase, 15 μL protease and 30 μL rat serum. B. Folates extracted from ZH203 under different enzyme treatments and amounts: A, 30 μL rat serum; B, 50 μL rat serum; C, 100 μL rat serum; D, 150 μL rat serum; E, 100 μL chicken pancreas; F, 200 μL rat serum + 200 μL chicken pancreas; G, 20 μL α-amylase + 15 μL protease + 30 μL rat serum; H, 20 μL α-amylase + 150 μL protease + 100 μL rat serum; I, 20 μL α-amylase + 150 μL protease + 100 μL rat serum + 150 μL chicken pancreas. C. Folate recovery of ZH203 at different boiling times (n = 3). Error bars represent the standard deviation of folate recovery from triplicate determinations. Total folate contents of bars with the same lowercase letters are not significantly different (P > 0.05).

Fig. 2. Comparison of the folate content (μg/100 g FW) between soybean cultivars and landraces.
Different lowercase letters indicate statistical differences at P < 0.05.

Fig. 3. Folate vitamer distribution (μg/100 g FW) among the three major ecoregions in China: Northern Region (NR), Huanghuaihai Region (HR) and Southern Region (SR). Different lowercase letters indicate statistical differences at P < 0.05.

Fig. 4. Correlation between folate vitamers and other seed quality traits. THF, H4folate; 5MTHF, 5-CH3-H4folate; 5,10MTHF, 5,10-CH=H4folate; 10FFA, 10-CHO-PteGlu; 5FTHF, 5-CHO-H4folate; DHF, H2folate. *, **, and *** indicate significance at the 5%, 1% and 0.1% levels, respectively.

Fig. 5. Geographical distribution of 5-CHO-H4folate (5FTHF), 5-CH3-H4folate (5MTHF) and total folate content (μg/100 g FW) across the three major soybean production areas in China.

Table 1. Calibration and sensitivity data for folate standards prepared in blank soybean matrix (n = 3)

| Folate | Limit of detection (μg/100 g) | Limit of quantification (μg/100 g) | Slope (mean ± SD, n = 7 or 8) | Correlation coefficient R2 | Linear range (μg/100 g) | Function |
|---|---|---|---|---|---|---|
| H4folate | 0.098 | 0.328 | 1833.97±12.35 | 0.995 | 0.328–100 | 1/x |
| 5-CH3-H4folate | 0.207 | 0.627 | 3827.72±33.09 | 0.992 | 0.627–500 | 1/x |
| 5,10-CH=H4folate | 0.124 | 0.377 | 2176.55±34.88 | 0.997 | 0.377–100 | 1/x |
| MeFox | 0.085 | 0.259 | 1263.45±11.82 | 0.998 | 0.259–1800 | 1/x |
| 10-CHO-PteGlu | 0.206 | 0.623 | 617.24±5.68 | 0.992 | 0.623–100 | 1/x |
| 5-CHO-H4folate | 0.366 | 1.109 | 1245.34±6.6 | 0.994 | 1.109–500 | 1/x |
| H2folate | 0.080 | 0.232 | 176.29±1.49 | 0.993 | 0.232–100 | 1/x |
| PteGlu | 0.148 | 0.447 | 376.78±1.43 | 0.991 | 0.447–200 | 1/x |
| MTX | 0.226 | 0.685 | 4454.49±8.56 | 0.995 | 0.685–400 | 1/x |
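Table 1 lists 1/x as the calibration function for every analyte. In LC-MS/MS work this notation is conventionally read as weighting each calibration point by the reciprocal of its concentration in the least-squares fit, so that the highest standards do not dominate the low end of the range. The fitting procedure itself is not described here, so the following is only a minimal sketch under that assumption; the function names and the example standards are hypothetical and are not data from this study.

```python
def fit_weighted_line(x, y):
    """Fit y = slope * x + intercept by least squares with weights w_i = 1 / x_i.

    Assumption: "1/x" in Table 1 denotes this weighting scheme.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired calibration points")
    if any(xi <= 0 for xi in x):
        raise ValueError("1/x weighting requires strictly positive concentrations")

    w = [1.0 / xi for xi in x]
    sw = sum(w)
    # Weighted means of concentration and detector response.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted covariance over weighted variance gives the slope.
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("all concentrations are identical")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def back_calculate(response, slope, intercept):
    """Convert a detector response back to a concentration using the fitted line."""
    return (response - intercept) / slope


# Hypothetical, noise-free standards (concentration, response); illustration only.
conc = [0.5, 1.0, 5.0, 10.0, 50.0, 100.0]
resp = [1800.0 * c + 40.0 for c in conc]

slope, intercept = fit_weighted_line(conc, resp)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

With real calibration data, the back-calculated concentration of each standard can be compared with its nominal value to judge whether the weighting is adequate across the linear range.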
Declaration of interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Supplementary Figure S1. MRM transition of seven folate vitamers, MeFox and internal standard, MTX.

Supplementary Figure S2. Folate distribution (μg/100 g FW) among different seed coat colors in soybean. Different lowercase letters indicate statistical differences at P < 0.05.

Supplementary Figure S3. Folate distribution (μg/100 g FW) among different seed weights in soybean. Different lowercase letters indicate statistical differences at P < 0.05.

Supplementary Table S1. Gradient program for folate analysis on HPLC MS/MS in soybean seeds

| Time (min) | Mobile phase A (%) | Mobile phase B (%) | Flow rate (mL/min) |
|---|---|---|---|
| 0.00 | 95.00 | 5.00 | 0.30 |
| 2.00 | 91.00 | 9.00 | 0.30 |
| 7.90 | 90.50 | 9.50 | 0.30 |
| 8.20 | 80.00 | 20.00 | 0.30 |
| 11.20 | 80.00 | 20.00 | 0.60 |
| 11.40 | 95.00 | 5.00 | 0.60 |
| 14.40 | 95.00 | 5.00 | 0.60 |
| 14.50 | 95.00 | 5.00 | 0.30 |
| 16.50 | 95.00 | 5.00 | 0.30 |

Supplementary Table S2.
Folates extracted (μg/100 g FW) after using different pH conditions in soybean seeds

| pH | H4folate | 5-CH3-H4folate | 5,10-CH=H4folate | MeFox | 10-CHO-PteGlu | 5-CHO-H4folate | H2folate | PteGlu | Total folates |
|---|---|---|---|---|---|---|---|---|---|
| 4.5 | 4.30±0.16 | 7.52±0.05 | 1.74±0.23 | 84.07±9.18 | 16.20±1.15 | 75.97±3.38 | 1.55±0.05 | 12.59±0.22 | 119.88±4.38 b |
| 5.5 | 6.98±0.17 | 9.03±1.12 | 1.47±0.32 | 96.68±3.89 | 29.67±0.33 | 78.18±2.24 | 1.81±0.09 | 17.18±0.32 | 144.32±0.41 a |
| 6.5 | 5.59±0.79 | 7.18±0.99 | 1.05±0.17 | 82.8±11.35 | 31.27±0.70 | 56.45±2.82 | 1.43±0.61 | 13.15±0.13 | 116.13±5.94 b |
| 7 | 4.09±0.24 | 5.56±0.16 | 0.95±0.01 | 66.65±4.64 | 25.33±1.25 | 41.01±0.61 | 1.38±0.11 | 9.03±0.34 | 87.35±0.45 d |
| 7.5 | 4.74±0.25 | 5.47±0.51 | 0.90±0.14 | 67.83±2.91 | 29.78±1.50 | 47.49±0.74 | 1.16±0.04 | 8.85±0.81 | 98.40±1.41 c |
| 8.5 | 4.51±0.00 | 6.02±0.45 | 1.06±0.03 | 85.07±5.65 | 35.24±0.35 | 45.35±3.47 | 1.92±0.13 | 8.92±0.82 | 103.02±2.84 c |
| 9.0 | 4.10±0.02 | 5.78±0.59 | 1.07±0.07 | 99.54±6.14 | 36.32±1.62 | 31.62±1.81 | 1.72±0.08 | 9.12±0.07 | 89.74±2.77 d |

Folate vitamers in μg/100 g (mean±SD) according to pH treatment. Lowercase letters after the total folate values indicate significant differences at P < 0.05.

Analysis of variance (ANOVA) of total folates by pH treatment

| Source | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| Treatment | 6 | 4804.50 | 800.75 | 77.26 | 4.855e-06 *** |

Supplementary Table S3.
Folates extracted (μg/100 g FW) after using different enzyme treatments and amounts in soybean seeds

| Enzyme treatment | H4folate | 5-CH3-H4folate | 5,10-CH=H4folate | MeFox | 10-CHO-PteGlu | 5-CHO-H4folate | H2folate | PteGlu | Total folates |
|---|---|---|---|---|---|---|---|---|---|
| 30 RS | 9.54±0.32 | 10.51±1.34 | 1.09±0.16 | 64.78±1.51 | 9.41±0.16 | 78.63±0.84 | 0.62±0.10 | 3.78±0.09 | 113.58±0.20 f |
| 50 RS | 12.39±0.04 | 13.05±1.92 | 1.37±0.28 | 85.90±2.77 | 11.60±0.26 | 104.45±0.36 | 0.74±0.12 | 5.07±0.19 | 148.66±1.98 de |
| 100 RS | 13.90±0.60 | 13.45±0.75 | 1.79±0.76 | 82.79±10.8 | 13.71±0.33 | 114.62±4.11 | 0.83±0.14 | 5.55±0.23 | 163.85±6.65 cd |
| 150 RS | 16.29±2.35 | 15.92±2.18 | 2.29±1.39 | 101.74±14.74 | 16.05±0.49 | 133.95±5.37 | 0.91±0.16 | 5.91±0.44 | 191.33±21.69 b |
| 100 CP | 4.50±0.19 | 8.18±0.90 | 0.00±0.00 | 146.87±10.23 | 6.26±0.02 | 103.35±8.36 | 0.00±0.00 | 7.30±0.00 | 129.58±9.10 ef |
| 200 RS + 200 CP | 23.88±0.16 | 21.15±4.86 | 2.98±0.17 | 105.81±4.10 | 15.97±0.49 | 149.49±0.25 | 1.25±0.35 | 5.83±0.22 | 220.54±4.75 a |
| 20 A, 15 P, 30 RS | 9.85±1.54 | 9.88±2.04 | 0.00±0.00 | 64.78±1.51 | 32.01±7.45 | 89.95±0.26 | 1.58±0.01 | 12.53±3.65 | 155.8±14.43 cde |
| 20 A, 150 P, 100 RS | 13.35±1.10 | 4.42±0.73 | 1.97±0.58 | 176.70±36.5 | 43.49±6.53 | 92.66±9.70 | 3.76±1.80 | 18.98±3.78 | 178.62±23.01 bc |
| 20 A, 150 P, 100 RS, 150 CP | 16.25±1.48 | 7.25±1.23 | 1.45±0.10 | 185.98±15.51 | 52.05±1.97 | 100.97±7.45 | 3.41±0.08 | 19.58±1.91 | 200.95±10.24 ab |

Folate vitamers in μg/100 g (mean±SD) according to enzyme treatment. Lowercase letters after the total folate values indicate significant differences at P < 0.05. Enzyme amounts are in μL. RS, rat serum; CP, chicken pancreas; A, α-amylase; P, protease.

Analysis of variance (ANOVA) of total folates by enzyme treatment

| Source | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| Treatment | 8 | 19606 | 2450.71 | 16.24 | 4.558e-05 *** |

Supplementary Table S4.
Folates extracted (μg/100 g FW) using different boiling times in soybean seeds

| Boiling time | H4folate | 5-CH3-H4folate | MeFox | 5,10-CH=H4folate | 10-CHO-PteGlu | 5-CHO-H4folate | H2folate | PteGlu | Total folates |
|---|---|---|---|---|---|---|---|---|---|
| 5 min | 15.21±0.71 | 22.27±0.86 | 152.49±5.34 | 9.65±0.19 | 18.68±0.53 | 111.83±4.97 | 2.73±0.36 | 20.70±1.42 | 201.06±7.60 c |
| 10 min | 14.87±0.60 | 22.82±1.22 | 165.27±0.90 | 8.62±0.88 | 23.26±0.61 | 127.03±1.61 | 2.06±0.13 | 25.52±1.09 | 224.19±2.74 b |
| 15 min | 19.34±1.29 | 31.67±1.76 | 220.11±7.11 | 11.03±2.34 | 29.51±1.76 | 167.61±1.02 | 1.88±0.15 | 28.95±2.16 | 290.00±8.44 a |
| 20 min | 17.15±0.93 | 27.58±3.16 | 201.9±8.19 | 8.99±0.94 | 27.79±2.77 | 164.87±1.33 | 1.54±0.08 | 27.58±1.04 | 275.49±10.09 a |

Folate vitamers in μg/100 g (mean±SD) according to boiling time. Lowercase letters after the total folate values indicate significant differences at P < 0.05.

Analysis of variance (ANOVA) of total folates by boiling time

| Source | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| Treatment | 3 | 10580.10 | 3526.70 | 59.13 | 0.0009049 *** |

Supplementary Table S5. Matrix effect and absolute recovery of folate vitamers

| Compound | Matrix effect (%) | Absolute recovery, low (%) | Absolute recovery, medium (%) | Absolute recovery, high (%) |
|---|---|---|---|---|
| H4folate | 94.97 | 38.48 | 37.39 | 35.27 |
| 5-CH3-H4folate | 104.87 | 70.26 | 71.61 | 74.22 |
| 5,10-CH=H4folate | 98.10 | 45.37 | 48.51 | 44.22 |
| MeFox | 97.99 | 97.4 | 95.85 | 94.34 |
| 10-CHO-PteGlu | 90.60 | 74.85 | 80.95 | 80.03 |
| 5-CHO-H4folate | 91.37 | 110.67 | 98.97 | 126.06 |
| H2folate | 81.86 | 15.18 | 19.90 | 11.43 |
| PteGlu | 91.40 | 56.95 | 57.83 | 59.17 |
| MTX | 93.36 | 81.44 | 85.93 | 75.33 |

MTX, internal standard.

Supplementary Table S6.
Precision measurements of folates standards and BCR 485

| Folate | Intra-day precision (%RSD) | Inter-day precision (%RSD) | Intra-day precision, BCR 485 (%RSD) | Inter-day precision, BCR 485 (%RSD) | Retention time (%RSD) |
|---|---|---|---|---|---|
| H4folate | 4.81 | 6.26 | 1.89 | 6.47 | 0.23 |
| 5-CH3-H4folate | 5.79 | 6.79 | 3.20 | 3.03 | 0.14 |
| 5,10-CH=H4folate | 3.81 | 7.95 | 5.92 | 7.44 | 0.21 |
| MeFox | 4.45 | 10.10 | 4.28 | 4.01 | 0.16 |
| 10-CHO-PteGlu | 4.40 | 11.09 | 3.25 | 4.34 | 0.09 |
| 5-CHO-H4folate | 5.36 | 7.04 | 4.92 | 5.57 | 0.25 |
| H2folate | 7.01 | 8.20 | 1.69 | 1.20 | 0.06 |
| PteGlu | 6.65 | 10.93 | 7.60 | 7.70 | 0.10 |
| MTX | 5.95 | 7.99 | 4.55 | 7.53 | 0.33 |

MTX, internal standard; %RSD, relative standard deviation.

Supplementary Table S7. Comparison of the folate values (μg/100 g) of BCR 485 detected in our study with previous studies and certified value

| Folate | Mean content (present study) | Shohag et al., 2017 (LC-MS/MS) | Ringling et al., 2013 (LC-MS/MS) | Vishnumohan et al., 2011 (LC-MS/MS) | Finglas et al., 1999 (HPLC) | Certified value (MA) | Indicative value |
|---|---|---|---|---|---|---|---|
| H4folate | 3.66±0.01 | 28.78±1.86 | 8.00 | ND | 5 | NA | NA |
| 5-CH3-H4folate | 332.79±0.05 | 249.00±11.13 | 320.90 | 375 | 202–294 | NA | 214.42 |
| 5,10-CH=H4folate | 1.27±0.13 | NA | 0.10 | NA | NA | NA | NA |
| MeFox | 200.28±7.83 | NA | NA | ND | NA | NA | NA |
| 10-CHO-PteGlu | 1.17±0.38 | NA | 1.10 | NA | ND | NA | NA |
| 5-CHO-H4folate | 9.97±0.15 | 23.18±1.63 | 5.00 | ND | ND | NA | NA |
| H2folate | 0.35±0.10 | NA | NA | NA | NA | NA | NA |
| PteGlu | 0.87±0.06 | NA | 0.80 | ND | ND | NA | NA |
| Total folates (without MeFox) | 350.08±0.58 | 289.30±14.22 | 336.00 | 375.00±16.00 | NA | 315.00±28.00 | NA |
| Extraction method (conjugase) | Mono-enzyme (CP + RS) | Mono-enzyme (CP + RS) | Mono-enzyme (CP + RS) | Tri-enzyme (human plasma) | Mono-enzyme (hog kidney) | - | Mono-enzyme (hog kidney) |

MA, microbiological assay; HPLC, high-performance liquid chromatography; LC-MS/MS, liquid chromatography tandem mass spectrometry; NA, not analysed; ND, not detected; "-", data not available.

Supplementary Table S8.
Descriptive statistics of 7 folate monoglutamates, total folates and MeFox of soybean accessions from Hainan 2018 extracted using our optimized extraction protocol

| Folate | Minimum (μg/100 g) | Maximum (μg/100 g) | Mean (μg/100 g) | Standard deviation | Coefficient of variation (%) |
|---|---|---|---|---|---|
| H4folate | 0.68 | 66.49 | 20.53 | 9.92 | 48.32 |
| 5-CH3-H4folate | 0.84 | 205.74 | 28.23 | 18.9 | 66.96 |
| 5,10-CH=H4folate | 0.43 | 28.46 | 5.12 | 3.00 | 58.56 |
| MeFox | 110.00 | 1601.71 | 407.02 | 154.37 | 37.93 |
| 10-CHO-PteGlu | 1.00 | 71.06 | 11.45 | 6.15 | 53.66 |
| 5-CHO-H4folate | 33.02 | 590.59 | 162.25 | 64.65 | 39.84 |
| H2folate | 0.25 | 29.44 | 2.90 | 2.68 | 92.37 |
| PteGlu | 2.27 | 163.09 | 31.84 | 15.28 | 48.00 |
| Total folates (without MeFox) | 64.51 | 691.24 | 262.01 | 84.29 | 32.17 |

Supplementary Table S9. List of soybean accessions containing > 400 μg/100 g FW of total folates

| Identification number/name | Accession type | Location/Ecoregion | Country | Total folates |
|---|---|---|---|---|
| ZDD14672 | Landrace | SR | China | 691.24 |
| ZDD12910 | Landrace | SR | China | 680.87 |
| ZDD12830 | Landrace | SR | China | 680.26 |
| ZDD14683 | Landrace | SR | China | 634.05 |
| ZDD02866 | Landrace | HR | China | 567.60 |
| ZDD09581 | Landrace | HR | China | 549.53 |
| ZDD13689 | Landrace | SR | China | 521.96 |
| ZDD14783 | Landrace | SR | China | 516.82 |
| WDD01594 | Cultivar | USA | USA | 515.93 |
| ZDD06233 | Landrace | SR | China | 515.21 |
| ZDD22642 | Cultivar | NR | China | 515.18 |
| ZDD10734 | Landrace | HR | China | 506.48 |
| ZDD06646 | Landrace | SR | China | 504.95 |
| ZDD02277 | Landrace | HR | China | 504.13 |
| ZDD02461 | Landrace | HR | China | 495.47 |
| ZDD01818 | Landrace | HR | China | 487.19 |
| ZDD23650 | Cultivar | NR | China | 487.19 |
| ZDD01169 | Landrace | NR | China | 483.48 |
| ZDD13815 | Landrace | SR | China | 478.77 |
| Youbili | Cultivar | Russia | Russia | 475.25 |
| ZDD07197 | Landrace | NR | China | 471.24 |
| Ha11-4519 | Cultivar | NR | China | 469.09 |
| ZDD14729 | Landrace | SR | China | 465.38 |
| WDD00476 | Cultivar | USA | USA | 463.04 |
| Ls15 | Cultivar | USA | USA | 460.80 |
| ZDD00163 | Landrace | NR | China | 453.74 |
| ZDD22798 | Cultivar | NR | China | 448.00 |
| ZDD23623 | Cultivar | NR | China | 445.81 |
| ZDD10276 | Landrace | HR | China | 443.99 |
| ZDD23632 | Cultivar | NR | China | 441.32 |
| ZDD00393 | Landrace | NR | China | 440.13 |
| ZDD01417 | Landrace | NR | China | 439.17 |
| ZDD14052 | Landrace | SR | China | 438.21 |
| ZDD06154 | Landrace | HR | China | 437.25 |
| ZDD07088 | Landrace | NR | China | 430.83 |
| ZDD00717 | Landrace | NR | China | 429.80 |
| WDD01607 | Cultivar | USA | USA | 429.76 |
| WDD00543 | Cultivar | USA | USA | 428.46 |
| ZDD16874 | Landrace | SR | China | 426.70 |
| ZDD06816 | Cultivar | NR | China | 426.36 |
| ZDD04653 | Landrace | SR | China | 426.33 |
| ZDD23615 | Cultivar | NR | China | 423.00 |
| ZDD00269 | Landrace | NR | China | 419.12 |
| ZDD00159 | Landrace | NR | China | 418.19 |
| ZDD00127 | Landrace | NR | China | 417.92 |
| ZDD02348 | Landrace | HR | China | 417.26 |
| ZDD17542 | Landrace | SR | China | 416.22 |
| WDD01253 | Cultivar | Japan | Japan | 415.94 |
| ZDD13696 | Landrace | SR | China | 415.45 |
| ZDD14780 | Landrace | SR | China | 410.62 |
| WDD00631 | Cultivar | USA | USA | 410.05 |
| WDD03008 | Cultivar | USA | USA | 409.61 |
| 16ZF310-5 | Cultivar | HR | China | 409.22 |
| ZDD08013 | Landrace | HR | China | 407.83 |
| ZDD00303 | Landrace | NR | China | 404.33 |
| ZDD07024 | Landrace | NR | China | 404.22 |
| ZDD00023 | Cultivar | NR | China | 404.11 |
| ZDD24399 | Cultivar | NR | China | 401.70 |
| WDD00984 | Cultivar | USA | USA | 401.69 |
| Z13-653-1 | Cultivar | HR | China | 401.60 |

NR, Northern Region; HR, Huanghuaihai Region; SR, Southern Region.

Supplementary Table S10. Correlations between the geographical factors and soybean folates

| Geographical factor | H4folate | 5-CH3-H4folate | 5,10-CH=H4folate | MeFox | 10-CHO-PteGlu | 5-CHO-H4folate | H2folate | PteGlu | Total folates |
|---|---|---|---|---|---|---|---|---|---|
| Longitude | 0.091* | 0.165*** | -0.065 | 0.183*** | 0.14*** | 0.188*** | 0.133*** | 0.042 | 0.086* |
| Latitude | 0.096* | -0.117** | 0.041 | 0.081* | 0.126** | 0.084* | -0.09* | 0.008 | 0.03 |
| Altitude | 0.088* | -0.02 | -0.091* | 0.007* | -0.08* | 0.057 | -0.014 | 0.061 | 0.025 |

Highlights

Conjugase treatment was sufficient for soybean folate extraction.
10-fold variation was observed for soybean folates, from 64.51 to 691.24 μg/100 g.
5-CHO-H4folate was the most dominant folate vitamer in soybean.
Folates are affected by accession type, ecoregion, and seed morphological traits.
Soybean is a key candidate for folate biofortification.
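The coefficient of variation reported in Supplementary Table S8 is the standard deviation expressed as a percentage of the mean, and the roughly 10-fold variation quoted in the Highlights corresponds to the ratio of the maximum to the minimum total folate content. As a quick check, both quantities can be recomputed from the total-folate row of Table S8. This is a minimal sketch; the helper names are illustrative only, and because the published mean and standard deviation are already rounded, recomputed values may differ from the table in the last decimal place.

```python
def coefficient_of_variation(mean, sd):
    """Return the coefficient of variation in percent (100 * SD / mean)."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return 100.0 * sd / mean


def fold_variation(minimum, maximum):
    """Return the ratio of the largest to the smallest observed value."""
    if minimum <= 0:
        raise ValueError("minimum must be positive")
    return maximum / minimum


# Total folates (without MeFox), μg/100 g, taken from Supplementary Table S8.
total_min, total_max = 64.51, 691.24
total_mean, total_sd = 262.01, 84.29

cv = coefficient_of_variation(total_mean, total_sd)
fold = fold_variation(total_min, total_max)
print(f"CV = {cv:.2f} %, fold variation = {fold:.1f}")
```

The same two functions can be applied to any other row of Table S8 to compare the relative spread of individual vitamers.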