Supporting Information Legends

Data S1. Differential expression between bm2 and the wild-type
The columns in this file are:
GeneID: gene ID
Ref: version of the reference genome
Chr: chromosome
Ori: gene orientation (either forward (+) or reverse (-) strand)
Start: the first physical position of the gene on the chromosome
End: the last physical position of the gene on the chromosome
ExonSize: total length of all the gene’s annotated exons
sample_rep1/3: raw read counts for the gene in each sample (bm2mut_rep1, bm2wt_rep1, bm2mut_rep3, bm2wt_rep3)
sample.RPKM: normalized read counts for the gene (“RPKM” means reads per kb of exonic sequence per million uniquely mapped reads) in each sample (bm2mut_rep1, bm2wt_rep1, bm2mut_rep3, bm2wt_rep3)
bm2mut.bm2wt_log2FC: log2 of the fold change between the bm2 non-mutant pool and the mutant pool
bm2mut.bm2wt_pvalue: p-value of the statistical test for differential expression of this gene between the bm2 non-mutant pool and the mutant pool
bm2mut.bm2wt_qvalue: the p-value corrected for multiple testing (q-value) for differential expression of this gene
bm2mut.bm2wt_sig: the answer to the question “Is this gene significantly differentially expressed?” Genes with a q-value smaller than 0.05 (5% FDR) are labeled “yes”.
description: description of the gene

Figure S1. Identification of Mutator insertion alleles in the bm2 locus and qRT-PCR results
A) Gene structure of the MTHFR gene. The insertion sites of multiple Mu transposons are indicated by the open triangles in the first exon. The annealing sites of the primers used to identify Mutator insertion alleles of bm2 are indicated.
B) Sequences of the primers used to identify Mutator insertions. Each of the primers (p1-5) was paired with MuTIR to amplify the Mu flanking sequences.
C) Summary of the PCR results for the identification of Mutator insertions with MuTIR and each of the primers (p1-5).
D) Plot of the qRT-PCR results.
Standard error is based on two biological replicates of midrib samples from 2-4 individual plants. Concentration is relative to B73, which is set to 1.

Figure S2. bm2 encodes a putative MTHFR
A) Representative phenotype of maize containing a Mu insertion in the putative MTHFR gene (bm2-Mu).
B) Adaxial and abaxial views of the midrib of non-mutant (WT) maize and bm2-Mu maize (bm2-Mu-112251).
C) Histochemical staining of lignin in midrib tissue sections from bm2-Mu plants. Scale bar: 100 μm.

Figure S3. Histochemical staining of lignin in tissue sections from wild-type and bm2-Mu mutant plants
Sections of midrib, stem and root were taken from wild-type individuals (WT) (a, c, e) and the bm2-Mu mutant (b, d, f) (bm2-Mu-10-7067E) and stained with phloroglucinol. Scale bar: 100 μm.

Figure S4. Polymorphisms in the bm2-ref allele as compared to the Bm2-B73 wild-type allele
A) Gene structure of the MTHFR gene. Black boxes indicate exons and lines between the boxes indicate introns. Stars indicate the sites with polymorphisms between the bm2-ref and Bm2-B73 alleles.
B) Sequence alignment of the 3’ UTRs of the Bm2-B73 allele and the bm2-ref allele. The 3’ UTR includes bases 1856 to 1373 of the mRNA. The stop codon (TGA) of the MTHFR coding sequence is underlined in red. Red stars indicate point mutations. The blue star indicates an insertion. The green star indicates a deletion.

Figure S5. Alignment of deduced amino acid sequences of the Bm2 gene with the sequence of MET11 from Saccharomyces cerevisiae

Figure S6. Histogram of p-values for differential expression tests
QuasiSeq was used to test the null hypothesis that expression of a given gene does not differ between the two groups. A p-value was obtained for each informative gene. Under the null hypothesis (no differentially expressed genes exist), the p-values follow a uniform distribution on the range 0-1.
More small p-values than expected under the uniform distribution indicate that significantly differentially expressed genes could be statistically identified.

Figure S7. Volcano plot from RNA-Seq
The volcano plot compares gene expression patterns between the two groups. Negative log10 p-values from the differential expression test were plotted against the log2 fold change for each informative gene. Each dot represents a gene, plotted with 20% transparency. The horizontal dashed line indicates the 5% FDR cutoff.

Figure S8. MA-similar plot from RNA-Seq
The MA-similar plot provides an overview of the level of differential expression between the groups being compared. Log2 fold change (y-axis) of each informative gene was plotted against log2 of mean expression (x-axis). Significantly differentially expressed genes are highlighted in red.

Figure S9. Comparison of fold-changes between two RNA-Seq experiments on 369 DEGs
The 369 significantly differentially expressed genes (DEGs) identified in the first RNA-Seq experiment are shown in the figure. The second RNA-Seq experiment, conducted with biological replicates, identified 3,313 significant DEGs. Of the 369 DEGs from the first RNA-Seq experiment, 323 (red) were also identified as DEGs in the second RNA-Seq experiment.

Figure S10. Overview of differential expression in the metabolic pathway in MapMan
MapMan (mapman.gabipd.org) provides a useful tool to visualize changes in gene expression in the comparison. Differential expression in the metabolic pathway is shown as an example. Each square represents a transcript. Squares are color-coded by log2 fold change between the bm2 non-mutant pool and the mutant pool from the RNA-Seq data. Genes up- and down-regulated in the bm2 mutant pool relative to the non-mutant pool are highlighted in blue and red, respectively.
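
The axes described in the legends for Figures S7 and S8 can be illustrated with a minimal sketch. This is not the script used to draw the figures; the function names and the example values below are hypothetical and serve only to show how each gene is placed on the two plots.

```python
import math


def volcano_point(log2_fc, p_value):
    """Coordinates of one gene on a volcano plot.

    x is the log2 fold change; y is the negative log10 of the p-value,
    so smaller p-values sit higher on the plot.
    """
    return log2_fc, -math.log10(p_value)


def ma_point(mean_expression, log2_fc):
    """Coordinates of one gene on an MA-similar plot.

    x is log2 of the mean expression; y is the log2 fold change.
    """
    return math.log2(mean_expression), log2_fc


# Hypothetical gene: log2 fold change 2.0, p-value 0.001, mean expression 8.0
vx, vy = volcano_point(2.0, 0.001)
mx, my = ma_point(8.0, 2.0)
print(f"volcano: x={vx:.2f}, y={vy:.2f}")
print(f"MA-similar: x={mx:.2f}, y={my:.2f}")
```

Both plots share the log2 fold change; they differ only in whether it is paired with statistical evidence (volcano) or with expression level (MA-similar).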

Figure S11. Phylogenetic tree of maize MTHFR conserved region
A phylogenetic tree for the conserved region of MTHFR (residues 167-350 of the BM2 protein sequence) was constructed by identifying orthologs of the maize MTHFR protein sequence with NCBI’s BLASTP. Eighteen land plant RefSeqs were selected for use in the tree along with a consensus sequence. The tree was built using BLOSUM62 neighbor joining in Jalview (http://www.jalview.org/). For each sequence, the GenBank accession number is given, followed by “|” and then a simplified name. In the simplified names, 1 and 2 distinguish between multiple paralogs, and _a and _b indicate duplicate copies of the same gene. BM2 is boxed in green.

Figure S12. qTeller expression pattern of bm2
Expression pattern for maize bm2 from qTeller (qTeller.com). The RNA-Seq data provided by qTeller (both published and unpublished) come from a variety of sources throughout the maize community. More information on the origin of the data sets and the in-house analysis of the data can be found on the qTeller website.

Table S1. Genes at the 2 Mb interval as induced by BSR-Seq
Table S2. Fine mapping primer sequences for KASPar assay
Table S3. 47 genes in the 0.51 Mb bm2 interval (working gene set)
Table S4. 8 genes in the 0.51 Mb bm2 interval (filtered gene set)
Table S5. Trimming and alignment summary of RNA-Seq #1
Table S6. Read trimming summary of RNA-Seq #2
Table S7. Alignment summary of RNA-Seq #2
Table S8. Over-represented pathways (MapMan)
Table S9. Genes in the phenylpropanoid pathway that did not exhibit significantly differential expression
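
As a rough illustration of two of the Data S1 columns, the sketch below computes an RPKM value from a raw read count following the definition given in the Data S1 legend (reads per kb of exonic sequence per million uniquely mapped reads), and assigns the significance label from a q-value. The function names, the example numbers, and the “no” label for non-significant genes are assumptions made for illustration; the legend only states that genes with a q-value smaller than 0.05 are labeled “yes”.

```python
def rpkm(read_count, exon_size_bp, uniquely_mapped_reads):
    """Reads per kb of exonic sequence per million uniquely mapped reads."""
    exon_size_kb = exon_size_bp / 1_000
    mapped_millions = uniquely_mapped_reads / 1_000_000
    return read_count / exon_size_kb / mapped_millions


def sig_label(q_value, fdr_cutoff=0.05):
    """Label a gene 'yes' when its q-value is below the FDR cutoff.

    The 'no' label is an assumption; the legend only defines 'yes'.
    """
    return "yes" if q_value < fdr_cutoff else "no"


# Hypothetical gene: 500 reads, 2,000 bp of exons, 10 million uniquely mapped reads
print(rpkm(500, 2_000, 10_000_000))
print(sig_label(0.01), sig_label(0.20))
```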