Supporting Information
Khakimov et al.
Supporting Figures
Figure S1. Multiple alignment of the selected OSC sequences from Barbarea vulgaris and Arabidopsis thaliana, using the software Muscle; this alignment was used to construct the phylogenetic tree in Fig. 2.
Figure S2. B. vulgaris QTL map showing 17 linkage groups by vertical bars, aligned to the five A. thaliana chromosomes. QTLs for flea beetle resistance, saponins of the G- and P-type, and glucosinolates are indicated by vertical colored bars, and positions of the six mapped genes for saponin biosynthesis by circles. Genetic distances (cM) are listed at the left of each linkage group and genetic markers (AFLP, SSR) at the right; corresponding marker positions on the A. thaliana chromosomes are indicated in millions of base pairs (Mb). QTLs for the four unknown saponins from the P-type are marked 1 to 4; see Kuzina et al. (2011) for more details; two of these were dominant, as were two QTLs for glucosinolates. Confidence intervals, explained variation and position of the maximal LOD score are in Table S2. AFLP markers are designated with the MseI (m) and PstI (p) primer combinations followed by molecular weight (bp).
Figure S3. EI-MS fragmentation patterns and retention times (RT) of trimethylsilyl (TMS) derivatives of 13 authentic triterpene standards and five tentatively identified triterpenoids (unknowns 1-5). Tentatively identified triterpenoids correspond to peak numbers as illustrated in Fig. 3. EI-MS patterns, RT and structures of (a) β-amyrin, (b) α-amyrin, (c) lupeol, (d) erythrodiol, (e) uvaol, (f) betulin, (g) oleanolic acid, (h) oleanolic aldehyde, (i) betulinic acid, (j) betulinic aldehyde, (k) ursolic acid, (l) ursolic aldehyde, (m) hederagenin were compared against authentic standards.
Tentative identification of triterpenoids (unknowns 1-5, n-u) was based on common EI-MS fragmentation patterns observed for most trimethylsilylated triterpenes and on mass spectral comparison with authentic standards. Characteristic fragmentation patterns, represented by loss of a methyl group (-15 m/z) followed by loss of dimethylsilyl oxonium (-75 m/z) and loss of a second methyl group (-15 m/z) from the final trimethylsilylated products (v), allowed estimation of the molecular masses of unknowns 1-5 (Table S8).
Figure S4. LC-MS/MS spectra of tentatively identified saponins produced by expressing (a) LUP5G + CYP716A80 + UGT73C11 and (b) LUP2P + CYP716A80 + UGT73C11 in N. benthamiana, as illustrated in Fig. 4 left (LUP5G) and right (LUP2P) panels, respectively. Mass spectra numbers correspond to peak numbers in Fig. 5. Mass spectra were recorded in negative mode, thus proposed aglycone masses correspond to [M – 1]⁻.
Figure S5. Saponins produced by expression of different combinations of B. vulgaris genes coding for OSCs, P450s and UGTs in N. benthamiana plant leaves. Panels (a-c) show LC-MS profiles for (1) (a) LUP2G, (b) and (c) LUP2P; (2) LUPs in combination with (a) CYP716A80, (b) CYP716A81, and (c) BvCYP716A80; (3) LUPs and CYPs in combination with (a) UGT73C11, (b) UGT73C13, and (c) UGT73C11. (2A) – (3D) EICs of (2) representing saponins with (2A) five sugar moieties (m/z 1266-1268, 1308-1310, 1380-1382, 1418-1420); (2B) four moieties (m/z 1088-1091, 1105-1110, 1145-1147, 1190-1192); (2C) two moieties (m/z 829-831, 901-903, 1564-1566); and (2D) one moiety (m/z 617-619, 666-668, 1236-1238, 1413-1415, 1408-1411). The y axis (ion count) of each chromatogram is scaled to the highest peak. Peaks highlighted with an asterisk (*) correspond to saponins that significantly decreased or disappeared after sodium hydroxide-based saponification, indicating presence of sugar moieties at the C28 position of the triterpene aglycones.
See Tables S5-S7 for more detailed information on detected saponins from these enzyme combinations.
Supporting Tables
Table S1. Genetic markers mapped in the 17 B. vulgaris linkage groups. Markers Ra12, 5 and 12 were designed for Rorippa and Bv6 for a B. vulgaris LEAFY gene (Kuzina et al. 2011). Reverse primers are designated with R; forward (F) primers contain an M13 tail (5' to 3': CACGACGTTGTAAAACGAC) for fluorescent labeling during PCR. SSRIT (Simple Sequence Repeat Identification Tool, http://www.gramene.org/db/markers/ssrtool) was used to search for SSRs.
Table S2. QTL positions and effects on resistance to flea beetle larvae, saponins, and glucosinolates. Link. group indicates the linkage group number, Expl. var. the explained variation in %, and Max. LOD position the position of the maximal LOD score. Metabolite numbers are as in Fig. S2 and in Kuzina et al. (2011).
Table S3. List of triterpenoid saponins detected in N. benthamiana leaves when expressing LUP5G + CYP716A80 + UGT73C11. For each peak, the peak number (corresponding to Fig. 5(a)), aglucone mass, predicted aglucone, and comments on the possible structure are given. Based on LC-MS/MS (Fig. 5a and Fig. S4a) and GC-MS (Fig. 3 and 4), triterpenoid aglycones with MW of 456 Da most likely correspond to oleanolic acid (456 Da), because this enzyme combination mostly produced oleanolic acid (74%). Triterpenoid aglycones with MW of 458 Da most likely correspond to unknown-4 and those with MW of 472 Da to unknown-5. * Peaks corresponding to saponins that significantly decreased or disappeared after saponification, indicating presence of sugar moieties at the C28 position of the triterpene aglycones. ** indicates oleanolic acid mono-glucoside.
Table S4. As Table S3, but for LUP2P + CYP716A81 + UGT73C11. Peak numbers correspond to Fig. 5(b). Based on LC-MS/MS (Fig. 5b and Fig. S6a,b) and GC-MS (Fig. 3 and 4), triterpenoid aglycones listed in this table with MW of 458 Da most likely correspond to unknown-4, and aglycones with MW of 472 Da to unknown-5. ** indicates possible mono-glucosides of unknown-4.
Table S5. As Table S3, but for LUP2G + CYP716A80 + UGT73C11. Peak numbers correspond to Fig. S5(a). Triterpenoid aglycones listed in this table with MW of 456 Da most likely correspond to oleanolic, ursolic, or betulinic acids, which are produced by this enzyme combination; aglycones with MW of 458 Da correspond to unknown-4 and aglycones with MW of 472 Da to unknown-5.
Table S6. As Table S3, but for LUP2P + CYP716A81 + UGT73C13. Peak numbers correspond to Fig. S5(b). Triterpenoid aglycones listed in this table with MW of 458 Da most likely correspond to unknown-4 (MW of 458 Da) and aglycones with MW of 472 Da correspond to unknown-5 (MW 472 Da). This is in agreement with the GC-MS analysis of the LUP2P + CYP716A81 enzyme combination, where 51% of the triterpenes correspond to unknown-4 and 26% of them to unknown-5 (Fig. 4 and Fig. S3). * Peaks corresponding to saponins that significantly decreased after sodium hydroxide-based saponification, indicating presence of sugar moieties at the C28 position of the triterpene aglycones.
Table S7. As Table S3, but for LUP2P + CYP716A80 + UGT73C11. Peak numbers correspond to Fig. S5(c). Triterpenoid aglycones listed in this table with MW of 456 Da most likely correspond to betulinic acid (MW of 456 Da), the main triterpenoid produced when LUP2P was co-expressed with CYP716A80 (Fig. S5). Aglycones with MW of 472 Da are probably derived from unknown-5 (MW 472 Da), and three saponins with MW of 458 Da are most likely derived from unknown-4. This is in agreement with the GC-MS analysis of LUP2P + CYP716A80, where 67% of the total triterpenes correspond to betulinic acid and 23% to unknown-5 (Fig. 4).
* Peaks corresponding to saponins that significantly decreased after sodium hydroxide-based saponification, indicating presence of sugar moieties at the C28 position of the triterpene aglycones. ** Corresponds to betulinic acid mono-glucoside.
Table S8. Tentative identification of the triterpenoid saponin aglycones of unknowns 1-5 produced by transient expression of OSCs and P450s in N. benthamiana leaves. Tentative identification is based on retention time, mass spectral interpretation and comparison with spectral data of similar triterpenes.
Table S9. Primers used for cloning (LUP2: Primer1 + Primer2, LUP5: Primer3 + Primer5, CYP716A: Primer5 + Primer6) and gene expression analyses (LUP2: Primer7 + Primer8, LUP5: Primer9 + Primer10, CYP716A: Primer11 + Primer12).
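Note on mass arithmetic. The nominal-mass reasoning used in the captions of Fig. S3 and Fig. S4 (successive losses of 15, 75 and 15 m/z from TMS derivatives; negative-mode ions at [M – 1]) can be illustrated with a minimal sketch. It is not the authors' workflow. The function names are hypothetical, the losses are assumed to be cumulative, each TMS group is assumed to add a net 72 Da (Si(CH3)3 replacing one active hydrogen), and each sugar is assumed to be a hexose residue of 162 Da.

```python
# Nominal (integer) masses only. Constants are standard values, but their use
# here is an illustrative assumption, not the procedure used in the paper.
TMS_NET_GAIN = 72     # Si(CH3)3 (73 Da) replaces one active hydrogen (1 Da)
METHYL_LOSS = 15      # loss of CH3
OXONIUM_LOSS = 75     # loss quoted in the Fig. S3 caption
HEXOSE_RESIDUE = 162  # C6H10O5, one condensed hexose (e.g. glucose) unit


def tms_fragment_ladder(free_mass, n_tms):
    """Return [M, M-15, M-15-75, M-15-75-15] for a TMS derivative.

    Assumes the three losses named in the caption occur cumulatively.
    """
    ladder = [free_mass + n_tms * TMS_NET_GAIN]
    for loss in (METHYL_LOSS, OXONIUM_LOSS, METHYL_LOSS):
        ladder.append(ladder[-1] - loss)
    return ladder


def free_mass_from_first_fragment(first_fragment, n_tms):
    """Estimate the underivatized mass from an observed [M-15] ion."""
    return first_fragment + METHYL_LOSS - n_tms * TMS_NET_GAIN


def glycoside_negative_ion(aglycone_mass, n_hexoses):
    """Nominal m/z of [M-H]- for an aglycone carrying n hexose units."""
    return aglycone_mass + n_hexoses * HEXOSE_RESIDUE - 1


if __name__ == "__main__":
    # Oleanolic acid (456 Da) with two TMS groups (hydroxyl and carboxyl).
    print(tms_fragment_ladder(456, 2))
    # A 456 Da aglycone carrying a single hexose, negative mode.
    print(glycoside_negative_ion(456, 1))
```

Under these assumptions, aglycones of 456 Da and 458 Da carrying one hexose give nominal ions at m/z 617 and 619, which falls within the m/z 617-619 window listed for one sugar moiety in the Fig. S5 caption.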