Experimental Procedures

Accession numbers. Short read sequence data from this article can be found in the NCBI Sequence Read Archive under accession number SRR882054.

Plant material and cultivation. C. hirsuta of the reference Ox accession (specimen voucher Hay 1 (OXF); Hay and Tsiantis 2006) and A. thaliana Col-0 were used unless otherwise stated. The DR5::VENUS line was previously described (Barkoulas et al. 2008). Plants were grown in long day conditions in a greenhouse (18 h light at 22°C, 6 h dark at 20°C) or in a controlled environment room (16 h light at 22°C, 8 h dark at 20°C) unless otherwise stated. Dry seeds were either sown on well-watered soil (2:1 peat:vermiculite) in 7x7 cm pots, or sterilized by ethanol washes and sown on plates containing Murashige and Skoog (MS) solid agar medium, covered, and placed at 4°C to stratify for 2-10 days before transfer to culture conditions.

Plant transformation efficiency. The percentage of A. thaliana Col-0 and C. hirsuta Ox seedlings that were hygromycin resistant was determined following transformation by floral dip with Agrobacterium tumefaciens (strain GV3101) containing the Gateway vector pMDC32 with a uidA gene insert.

Microscopy. Scanning electron microscopy was performed as previously described (Hay and Tsiantis 2006), and an Olympus BX50 was used for light microscopy. Confocal laser scanning microscopy (CLSM) was performed with a Zeiss 510 Meta microscope.

BAC libraries. A bacterial artificial chromosome (BAC) library with large genomic DNA inserts of C. hirsuta Ox was constructed in the vector pIndigoBAC5 (HindIII) by Southern Illinois University, Carbondale, IL 62901-6899, USA. A second BAC library was constructed by the Arizona Genomics Institute, University of Arizona, Tucson, AZ 85721, USA, in the vector pAGIBAC1, a modification of pIndigoBAC5 by the addition of a SwaI site, according to a published protocol (Grotewold 2003).
A full description of library construction will be given elsewhere.

Recombinant inbred line (RIL) population. To initiate a C. hirsuta RIL population, the Wa accession from Clark County, WA, USA (gift from K. Marhold) was selfed twice and then crossed to the Ox accession, which had been selfed eight times, with Ox as the maternal parent. A single F1 progeny was selected and allowed to self-fertilize to generate an F2 population. A total of 195 F2 progeny were propagated through selfing and single-seed descent to generate the F8 RIL population. DNA was extracted from single individuals of all 195 lines when the majority had reached F8 (8 lines were at the F7 generation). Genotyping of the F8 showed that 8 lines had unexpectedly high levels of heterozygosity; these lines were permanently discarded to prevent contamination of the population through inadvertent cross-pollination. To maintain the allele frequency present in this F8 population of 187 RILs, seed from 6 progeny plants of each line was bulked for future use (for 66 lines, fewer than 6 plants were bulked).

Molecular marker design and genotyping. To identify nucleotide polymorphisms for use as molecular markers, primers were designed based on Ox BAC-end sequences to sequence the respective regions in Wa (BigDye® Terminator v3.1). In addition, EST sequence libraries of both accessions were screened for single nucleotide polymorphisms (SNPs). Sequenom assays were designed from selected SNPs and used to genotype each individual from the Ox x Wa RIL population (Wellcome Trust Centre for Human Genetics, High Throughput Genomics, Oxford, UK). The RILs were genotyped for 46 microsatellite markers by amplifying each locus with a fluorescently labelled primer pair followed by ABI3100 capillary electrophoresis, and for an additional set of SNPs by pyrosequencing (TraitGenetics GmbH, Gatersleben, Germany).
Four PCR-based dCAPS markers were added to the genetic map using these primer pairs and restriction enzymes: marker STM_322_1: primers 5'-TTGTTCCTTTTGGCTAGTG-3' and 5'-CAAAGATCATGGCTCATCC-3', enzyme Hpy188I; marker AP1: primers 5'-TCCCTAAAACCGCTCTTAGC-3' and 5'-AGAGAGATAAAGAAGAGTTCAGGC-3', enzyme AluI; marker TCP4: primers 5'-TGAGCTTCCTCCTTGGAATC-3' and 5'-ACCGAACTGAAGCTGGTGTTGCAG-3', enzyme AluI; marker PSL: primers 5'-ATCTTCACGTTGGAGAAGCAGGG-3' and 5'-TTCGATCTTGCAGAACAACTGTA-3', enzyme RsaI.

Genetic map. The genetic map was produced with JoinMap® 4.0 (Van Ooijen 2006), using allelic information from all F8 RILs.

Transmission distortion. 163 F8 seeds from RIL46 were sown and 146 seeds germinated. The number of seeds that failed to germinate did not fit the expectation of 25% for zygotic lethality (χ² test; p<0.001). The germinated plants were genotyped with markers 705_ and 475_3, located at 64.2 and 66.6 cM in a distorted region on chromosome 4 (SSLP marker 705_: primers 5'-GGTTTGTTGATATTGATGGG-3' and 5'-TGCAGTATAATTGCCTCCTT-3'; dCAPS marker 475_3: primers 5'-CACAGAATCGGTACACAAAGGA-3' and 5'-CGTGTGAACTTAGACTGCGATG-3', enzyme ApoI).

Chromosome preparation and probes for comparative chromosome painting (CCP) and BAC fluorescence in situ hybridization (FISH). C. hirsuta whole inflorescences were fixed in freshly prepared ethanol:acetic acid (3:1) overnight and stored in 70% ethanol at –20°C until used. Microscopic preparations with meiotic (pachytene) and mitotic chromosomes were obtained from young anthers as previously described (Lysak and Mandakova 2013, Mandakova and Lysak 2008). Suitable slides were post-fixed in 4% formaldehyde in deionized water for 10 min, air-dried and stored at 4°C until used. A. thaliana BAC clones were obtained from the Arabidopsis Biological Resource Center (ABRC, Columbus, OH) and used as probes for CCP in C. hirsuta.
Chromosome-specific BAC contigs were arranged to represent 24 ancestral genomic blocks (A to X) of the putative Ancestral Crucifer Karyotype (Schranz et al. 2006). On average, every third BAC clone was used for each contig from the full list of BAC clones previously described (Mandakova and Lysak 2008). For BAC FISH in C. hirsuta, 40 C. hirsuta BAC clones corresponding to seven genomic blocks (O, P, Q, R, V, W, and X) of CH6 and CH8 were used (Table 1 below). DNA of individual BAC clones was isolated using a standard alkaline extraction omitting the phenol:chloroform purification step. BAC DNA was labeled with biotin-, digoxigenin-, and Cy3-dUTP via nick translation as previously described (Lysak and Mandakova 2013, Mandakova and Lysak 2008).

CCP and BAC FISH. Selected slides were treated with RNase (AppliChem; 100 µg/mL in water) at 37°C for 1 h, and washed in 2× SSC for 2–5 min. To remove cytoplasm, the slides were treated with pepsin (Sigma-Aldrich; 0.1 mg/mL) in 0.01 M HCl at 37°C for 10 min, followed by a wash in 2× SSC for 2–5 min. Subsequently, the slides were postfixed in 4% formaldehyde in 2× SSC for 10 min, washed in 2× SSC (2–5 min), and dehydrated in an ethanol series (70, 80, and 96%). Labeled BAC DNAs were pooled and ethanol precipitated to reduce the probe volume; 500 ng of labeled DNA per BAC clone were used. For one slide, the probe was dissolved in 20 µL of hybridization mix (50% formamide, 10% dextran sulfate in 2× SSC) at 37°C overnight. The probe and chromosomes were denatured together on a hot plate at 80°C for 2 min and incubated in a moist chamber at 37°C for 48 h. Post-hybridization washing was performed in 20% formamide in 2× SSC for 3 × 5 min at 42°C. Detection of labeled DNA was done as previously described (Lysak and Mandakova 2013). Chromosome preparations were counterstained with DAPI (2 µg/mL) in Vectashield (Vector Laboratories), then observed and photographed using an Olympus BX-61 epifluorescence microscope.
Monochromatic images were acquired separately for all fluorochromes using appropriate excitation and emission filters (AHF Analysentechnik) and an AxioCam CCD camera (Zeiss), then pseudo-coloured and merged using Adobe Photoshop CS2 software (Adobe Systems). Pachytene chromosomes in Figure 3 were straightened using the "straighten-curved-objects" plugin in the ImageJ software (Kocsis et al. 1991).

Physiology. For auxin sensitivity experiments, seedlings were grown on vertical MS plates supplemented with indole-3-acetic acid (IAA, Sigma) at the concentrations indicated in Figure 5. Alternatively, seedlings were transferred to IAA-containing media at 4 days after germination (DAG) and either (1) stained with propidium iodide (25 μg/ml) after 40 minutes and analysed for DR5::VENUS expression by CLSM, or (2) measured for root meristem size after 9 and 24 h. Root meristem size is expressed as the number of cortex cells in a file extending from the quiescent center to the first elongated cortex cell, as previously described (Dello Ioio et al. 2008). To assay the hypocotyl elongation response to simulated shade, A. thaliana and C. hirsuta seedlings were germinated and grown on horizontal MS plates for 3 days under white light (W) and then either kept in W or transferred to far-red supplemented lighting for 4 more days. At day 7, hypocotyls were measured from at least 30 seedlings for each treatment and species. To assay the hypocotyl elongation response to gibberellic acid (GA3, Sigma), seedlings were germinated and grown in W for 10 days on horizontal MS plates supplemented with GA3 at the concentrations indicated in Figure 5. At least 10 seedlings were measured for each treatment and species.

Morphology and Histology. Petal shape was determined from measurements of petals at floral stages 13-15 viewed under light microscopy.
Seed area was measured by fitting an ellipse to the contour of photographed seeds using the ImageJ macro "DrawEllipse" (National Institutes of Health), and weight was determined from batches of counted seeds. Roots of C. hirsuta and A. thaliana seedlings 5 days after germination (DAG) were either (1) fixed, embedded in paraffin, and sectioned at 8 μm through the differentiation zone, with sections stained with toluidine blue O as previously described (Grigg et al. 2005), or (2) stained with a modified pseudo-Schiff propidium iodide method as previously described (Rast and Simon 2012), with root meristems viewed with a 40x lens by CLSM. A scoring diagram for shoot branching is described in Figure S7.

QTL analysis. QTL analysis of stamen number variation was performed in the Ox x Wa RIL population. Three progeny plants of each genotyped line and 15 replicates of each founder accession were grown in the greenhouse. Flowers were removed from the plant when they opened and the stamens were counted with the aid of a dissecting microscope. At least 24 flowers were sampled per RIL, with the exception of 3 RILs for which fewer flowers were scored. Average stamen number was calculated per plant and used to determine the broad sense heritability (H²) of the trait. QTL analysis was performed on the mean stamen number per RIL, including the 8 F7 lines, using Genstat 15th edition (Payne et al. 2010). Genetic predictors were calculated with no more than 2 cM distance between them. A genome-wide significance threshold for the QTL scan was determined to be –log10(P) = 3.39 for α=0.05, using the Meff method described by Li and Ji (2005). The QTL mapping methodology involved fitting a mixed effects model in which the QTL were fixed effects and the genotypes were random effects. A simple interval mapping genome scan was followed by several rounds of composite interval mapping during which co-factors were added or removed until no further improvement could be made.
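To illustrate how a –log10(P) threshold of this kind follows from an effective number of independent tests, the following is a minimal sketch of the Li and Ji (2005) calculation; it is not the Genstat implementation. It assumes the eigenvalues of the marker correlation matrix are already available. The function names are hypothetical, and the example value Meff = 126 is back-calculated here as a value consistent with the reported 3.39; it is not stated in the text.

```python
import math

def meff_li_ji(eigenvalues):
    """Effective number of independent tests (Li and Ji 2005).

    Each eigenvalue lam of the marker correlation matrix contributes
    I(|lam| >= 1) + (|lam| - floor(|lam|)).
    """
    total = 0.0
    for lam in eigenvalues:
        x = abs(lam)
        total += (1.0 if x >= 1.0 else 0.0) + (x - math.floor(x))
    return total

def genome_wide_threshold(alpha, meff):
    """Sidak-corrected per-test significance level, returned as -log10(P)."""
    alpha_per_test = 1.0 - (1.0 - alpha) ** (1.0 / meff)
    return -math.log10(alpha_per_test)

# Three uncorrelated markers: identity correlation matrix, eigenvalues 1, 1, 1.
print(meff_li_ji([1.0, 1.0, 1.0]))
# Three perfectly correlated markers: eigenvalues 3, 0, 0.
print(meff_li_ji([3.0, 0.0, 0.0]))
# Hypothetical Meff = 126, assumed for illustration only.
print(round(genome_wide_threshold(0.05, 126), 2))
```

Uncorrelated markers each count as one test, while fully correlated markers collapse to a single test, so the threshold depends on marker linkage rather than on the raw number of genetic predictors.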
A final QTL model was fitted that included all detected QTL to determine the allelic effects and the total phenotypic variance explained. The variance explained per QTL was estimated by sequentially dropping each QTL from the model; the difference in variance explained by the reduced model versus the full model was attributed to the respective QTL. Epistatic interactions between the detected QTL were analysed by backwards selection from a model with all additive QTL and all possible pairwise combinations between them. At each round of selection the least significant interaction was removed until only significant interactions (α=0.05) remained. A final epistatic QTL model was fitted that contained all QTL as additive terms plus the significant interactions between them to estimate the variance explained.

High-throughput sequence analysis. RNA was isolated from C. hirsuta shoots of Ox and Wa accessions following floral induction. The Ox and Wa samples were sequenced on the Illumina HiSeq 2000 platform (WTCHG Oxford Genomics Centre). RNAseq reads from Ox and Wa accessions were mapped to the A. thaliana reference genome (TAIR10) using STAR (Dobin et al. 2013). Raw counts for each gene were measured with HTSeq version 0.5.4p3 (http://www-huber.embl.de/users/anders/HTSeq/) using the options "--stranded=no --mode=intersection-strict -t CDS". Differential expression was called using these counts with DESeq version 1.10.1 (http://www-huber.embl.de/users/anders/DESeq) with an FDR of 5% (Figure S8). Fold changes in expression were reported after correcting for differences in library sizes. The RNAseq alignment files for two biological replicates of the same accession, generated using STAR in the preceding analysis, were merged to generate accession-based alignment files in BAM format. A preliminary scan was performed separately on each accession's BAM file to identify SNPs against TAIR10 using samtools (Li et al. 2009).
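The library-size correction mentioned in the differential expression step can be sketched as follows. DESeq estimates one size factor per library as the median, over genes, of the ratio of each count to that gene's geometric mean across libraries. The code below is a minimal pure-Python illustration of that idea, not the DESeq code itself; the function names are hypothetical and the counts are made up for demonstration.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of per-gene rows, each row holding one raw count per library.
    Genes with a zero count in any library are skipped, because their
    geometric mean is zero.
    """
    n_libs = len(counts[0])
    log_ratios = [[] for _ in range(n_libs)]
    for row in counts:
        if min(row) <= 0:
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_libs
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]

def normalized_log2_fold_change(count_a, count_b, sf_a, sf_b):
    """log2 fold change of library B over library A after size-factor scaling."""
    return math.log2((count_b / sf_b) / (count_a / sf_a))

# Made-up counts: library 2 was sequenced twice as deeply as library 1.
toy_counts = [[10, 20], [50, 100], [200, 400], [0, 3]]
sf = size_factors(toy_counts)
print(sf)
print(normalized_log2_fold_change(50, 100, sf[0], sf[1]))
```

In this toy case every gene differs only by sequencing depth, so the normalized fold change is close to zero on the log2 scale; without the correction it would appear as a twofold difference.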
The SNP sites identified in both accessions were pooled together as candidate sites. At each candidate site, the read count supporting each allele was compared between the Ox and Wa accessions using Fisher's exact test, and calls that were not significantly different were rejected. The program "mcmerge dscmp" from the software package IMR/DENOM was used for this analysis with a Phred score cutoff of 20 (Gan et al. 2011). The Ox sample was also used for one sequencing run on a Roche Genome Sequencer FLX (Liverpool Centre for Genomic Research), which gave 515,571 clean sequence reads with a mean read length of 246.2 bp. De novo assembly yielded 22,496 contigs longer than 200 bp, of which 7,665 were longer than 500 bp (N50 = 585 bp). To improve the assembly we used the 22,496 contigs as a BLAST query against the annotated A. thaliana coding sequences from TAIR9. This gave 12,072 alignments with A. thaliana unigenes and a distribution of much longer C. hirsuta cDNA contigs (Figure S9). GO annotation of C. hirsuta cDNAs was close to 100% for most contigs, but considerably lower for shorter sequences (Figure S10). Nucleotide divergence between C. hirsuta and A. thaliana cDNAs differed dramatically between the 1st, 2nd and 3rd codon positions, as expected for protein coding genes (Figure S11).

Phylogeography. 163 accessions from the putative native range of C. hirsuta and 15 accessions from the introduced range were sampled from herbarium specimens, field-collected silica-dried material, fresh material from plants grown from seed, and 4 DNA extractions (see Table S2 for details). A. thaliana herbarium specimens included two Ethiopian specimens from Wageningen University (de Wilde, J.J.F.E. 6945 and Wieringa, J.J. 4966) and two Moroccan specimens from the University of Reading (Jury, S.L. 14175 and Monserrat, J.M. 2379/94). Total genomic DNA was extracted using the DNeasy Plant Mini Kit (Qiagen, Valencia, CA).
PCR amplification of the chloroplast ndhF-rpl32 intergenic spacer was performed using the previously described primers ndhF and rpL32-R (Shaw et al. 2007) and internal primers designed for this study: ndhF-rpl32_FI 5'-AGAATAKAACAAAGATTTACAC-3' and ndhF-rpl32_RI 5'-TCACTTAGTGTACTGRAAGACAA-3'. The primers were used in the following combinations to amplify two fragments with a 167-bp overlap in C. hirsuta (Ox): ndhF-rpl32_FI with rpL32-R, and ndhF with ndhF-rpl32_RI. A portion of the 5' flanking region of the nuclear Atmyb2 gene was amplified using the primers ATM1 and ATM4 (Beck et al. 2008). The IVS1 intron of the nuclear MADS-box gene PISTILLATA was amplified using the primers PI-ITF and PI-ITR (Lee et al. 2002). PCR was performed in a total volume of 15 μL, containing 1x PCR buffer, 2.3 mM MgCl2, 0.2 μM of each primer, 0.3 mM of each dNTP, 0.67 mg/ml bovine serum albumin, 0.4 μL Mango Taq (Bioline), and 1.5 μL template DNA. PCR conditions were as follows. For chloroplast primers: 80°C for 5 min followed by 30 cycles each consisting of 95°C for 1 min, 47-49°C for 1 min, a ramp of 0.3°C/s to 65°C, and 65°C for 4 min, followed by a final extension step at 65°C for 5 min. For Atmyb2 primers: 94°C for 3 min, then 35 cycles of 94°C for 1 min, 50°C for 1 min and 72°C for 2 min, then a final extension for 6 min at 72°C. For PISTILLATA primers: 94°C for 2 min followed by 35 cycles each consisting of 94°C for 30 s, 55-58°C for 30 s, and 68°C for 1 min, and a final extension step at 68°C for 5 min. Both strands were sequenced using BigDye v 3 (Applied Biosystems) and products were analysed on a 3730xl DNA Analyzer (Applied Biosystems). Sequences were assembled using Sequencher v 4.5 (Gene Codes Corporation); A. thaliana sequences were trimmed to sequences downloaded from GenBank (Atmyb2 EF552847-EF553318 and PISTILLATA EF594536-EF594739) and aligned by eye in MEGA 5.05 (Tamura et al. 2011).
Length-variable mononucleotide repeats were removed from the alignment as they are potentially homoplastic. Indels longer than 1 bp were treated as single-step events and recoded as single characters. The relationship among haplotypes was reconstructed using statistical parsimony with a 95% connection limit (Templeton et al. 1992) and gaps as a 5th state in TCS v 1.21 (Clement et al. 2000). One reticulation was deleted from the haplotype network in Figure 7, between haplotype 17 and an intermediate haplotype between haplotypes 5 and 11, following the criteria of Crandall and Templeton (1993).

References

Barkoulas, M., Hay, A., Kougioumoutzi, E. and Tsiantis, M. (2008) A developmental framework for dissected leaf formation in the Arabidopsis relative Cardamine hirsuta. Nature Genetics, 40, 1136-1141.
Beck, J.B., Schmuths, H. and Schaal, B.A. (2008) Native range genetic variation in Arabidopsis thaliana is strongly geographically structured and reflects Pleistocene glacial dynamics. Molecular Ecology, 17, 902-915.
Clement, M., Posada, D. and Crandall, K.A. (2000) TCS: a computer program to estimate gene genealogies. Molecular Ecology, 9, 1657-1659.
Crandall, K.A. and Templeton, A.R. (1993) Empirical tests of some predictions from coalescent theory with applications to intraspecific phylogeny reconstruction. Genetics, 134, 959-969.
Dello Ioio, R., Nakamura, K., Moubayidin, L., Perilli, S., Taniguchi, M., Morita, M.T., Aoyama, T., Costantino, P. and Sabatini, S. (2008) A genetic framework for the control of cell division and differentiation in the root meristem. Science, 322, 1380-1384.
Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M. and Gingeras, T.R. (2013) STAR: ultrafast universal RNA-seq aligner. Bioinformatics, 29, 15-21.
Gan, X., Stegle, O., Behr, J., Steffen, J.G., Drewe, P., Hildebrand, K.L., Lyngsoe, R., Schultheiss, S.J., Osborne, E.J., Sreedharan, V.T., Kahles, A., Bohnert, R., Jean, G., Derwent, P., Kersey, P., Belfield, E.J., Harberd, N.P., Kemen, E., Toomajian, C., Kover, P.X., Clark, R.M., Ratsch, G. and Mott, R. (2011) Multiple reference genomes and transcriptomes for Arabidopsis thaliana. Nature, 477, 419-423.
Grigg, S.P., Canales, C., Hay, A. and Tsiantis, M. (2005) SERRATE coordinates shoot meristem function and leaf axial patterning in Arabidopsis. Nature, 437, 1022-1026.
Grotewold, E. (ed.) (2003) Plant Functional Genomics: Methods and Protocols. Totowa, NJ: Humana Press.
Hay, A. and Tsiantis, M. (2006) The genetic basis for differences in leaf form between Arabidopsis thaliana and its wild relative Cardamine hirsuta. Nature Genetics, 38, 942-947.
Lee, J.Y., Mummenhoff, K. and Bowman, J.L. (2002) Allopolyploidization and evolution of species with reduced floral structures in Lepidium L. (Brassicaceae). Proc Natl Acad Sci U S A, 99, 16835-16840.
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R. and 1000 Genome Project Data Processing Subgroup (2009) The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25, 2078-2079.
Li, J. and Ji, L. (2005) Adjusting multiple testing in multilocus analyses using the eigenvalues of a correlation matrix. Heredity, 95, 221-227.
Lysak, M.A. and Mandakova, T. (2013) Analysis of plant meiotic chromosomes by chromosome painting. Methods in Molecular Biology, 990, 13-24.
Mandakova, T. and Lysak, M.A. (2008) Chromosomal phylogeny and karyotype evolution in x=7 crucifer species (Brassicaceae). The Plant Cell, 20, 2559-2570.
Payne, R.W., Murray, D.A., Harding, S.A., Baird, D.B. and Soutar, D.M. (2010) An Introduction to GenStat for Windows (13th Edition). Hemel Hempstead, UK: VSN International.
Rast, M.I. and Simon, R. (2012) Arabidopsis JAGGED LATERAL ORGANS acts with ASYMMETRIC LEAVES2 to coordinate KNOX and PIN expression in shoot and root meristems. The Plant Cell, 24, 2917-2933.
Schranz, M.E., Lysak, M.A. and Mitchell-Olds, T. (2006) The ABC's of comparative genomics in the Brassicaceae: building blocks of crucifer genomes. Trends in Plant Science, 11, 535-542.
Shaw, J., Lickey, E.B., Schilling, E.E. and Small, R.L. (2007) Comparison of whole chloroplast genome sequences to choose noncoding regions for phylogenetic studies in angiosperms: the tortoise and the hare III. American Journal of Botany, 94, 275-288.
Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M. and Kumar, S. (2011) MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Molecular Biology and Evolution, 28, 2731-2739.
Templeton, A.R., Crandall, K.A. and Sing, C.F. (1992) A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics, 132, 619-633.
Van Ooijen, J.W. (2006) JoinMap® 4, software for the calculation of genetic linkage maps in experimental populations. Wageningen, Netherlands: Kyazma B.V.