Reported below is the progress made in the first year of this project. The progress updates are organised according to the three objectives; associated milestones in this period are noted.

Objective 1: To improve the resolution of recently discovered QTL for water and phosphorus-use efficiency in Brassica oleracea by the fine mapping of introgressions in substitution lines

QTL for WUE: SL118

In a preceding DEFRA project (JH0439/IF0168) we took a line previously found to have improved water use efficiency (WUE), AGSL118, and worked to improve the QTL resolution. That project produced 72 B. oleracea lines derived from the BC1F3 generation of the cross AGSL118 x A12. It also identified two introgressions of GD33 of interest, on chromosomes 1 and 7 (C1 and C7), that were segregating in these 72 lines. A third introgression in SL118, on chromosome 6, appeared not to affect WUE.

To further resolve the region(s) responsible for the improvement in the trait, we first characterised the 72 lines with 6 additional markers (see Appendix 1). Two of these markers, ‘At3g03380’ and ‘At3g01180’, map to the C1 GD33 region and were provided by a BBSRC-funded project (BB/E006418/1 - Prof. Bill Finch-Savage), which is characterising the same region of C1 using AGSL101. The other four markers, BRAS019, Na12F03, FITO302 and A48350, map within the C7 GD33 introgression and were identified by the preceding DEFRA bridging project IF0168. Although 6 markers were available to screen the introgressed GD33 region, there was potentially a marker gap of up to 20 cM between the most centromeric GD33 result and the next A12 result. The intervening region is consequently uncharacterised and the introgression boundary poorly defined. It is possible that the introgression is shorter in some of the BC1F3 lines of SL118 than in the original SL118, and some of the BC1F3 lines may have lost the region associated with increased water use efficiency.
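The marker-gap problem described above can be illustrated with a minimal Python sketch. It takes an ordered set of marker calls for one line and reports the intervals in which an introgression boundary must lie, together with their width in cM. The marker names, map positions and calls in the example are hypothetical and are not taken from Appendix 1; treating H calls as carrying GD33, and skipping ambiguous (A/H, G/H) and missing (-) calls, are assumptions made for illustration only.

```python
from typing import List, Tuple

# (marker name, map position in cM, genotype call); all example values are hypothetical.
Marker = Tuple[str, float, str]


def carries_gd33(call: str) -> bool:
    """Assumption: a heterozygous call (H) still indicates the GD33 segment is present."""
    return call in ("G", "H")


def boundary_intervals(markers: List[Marker]) -> List[Tuple[str, str, float]]:
    """Return (left marker, right marker, width in cM) for each pair of adjacent
    informative markers between which the GD33 segment starts or ends."""
    # Ambiguous (A/H, G/H) and missing (-) calls are skipped as uninformative.
    informative = sorted(
        (m for m in markers if m[2] in ("G", "A", "H")),
        key=lambda m: m[1],
    )
    intervals = []
    for left, right in zip(informative, informative[1:]):
        if carries_gd33(left[2]) != carries_gd33(right[2]):
            intervals.append((left[0], right[0], right[1] - left[1]))
    return intervals


# Hypothetical line: GD33 calls at m2 and m3, flanked by A12 calls, one missing value.
example = [
    ("m1", 10.0, "A"),
    ("m2", 30.0, "G"),
    ("m3", 35.0, "G"),
    ("m4", 40.0, "-"),
    ("m5", 55.0, "A"),
]

for left, right, width in boundary_intervals(example):
    print(f"boundary between {left} and {right}: up to {width} cM uncharacterised")
```

In this sketch a wide interval flags a boundary that additional polymorphic markers would need to resolve, which is the situation described for the C7 introgression.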
Water use efficiency was measured in the BC1F3 lines, but logistical constraints meant that only 5 of the 72 lines could be accurately phenotyped alongside the recurrent parent, A12. A previous DEFRA project (AT0120/HH3608TX) detected a significant WUE QTL on C7, hence this region was given most attention when selecting plants for phenotyping. Four lines were selected with a single C7 introgression to cover the possibility that some might have a shortened introgression. We also analysed a line with both C1 and C7 introgressions. The physiological WUE characterisation showed that the line with the C1 + C7 double introgression has significantly lower water use efficiency than the A12 parent and any of the single C7 introgression lines. This suggests that genes in these two regions may be acting epistatically. The phenotypic data confirmed that loss of the C6 introgression has no effect on WUE, and thus this region can be disregarded in future studies. The variability among the four single C7 candidate lines indicates that they are probably not genetically identical, but they could be useful in elucidating the ‘functional’ region in the marker-undefined segment.

Fig 1. WUE measurements of BC1F3 lines. WUE in BAT070 (C1&C7) is significantly different from A12 and similar to the original AGSL118 line; WUE in other lines (single C7) is not significantly different to A12.

Our work so far has highlighted an urgent requirement for more markers (polymorphic between A12 & GD33) within this uncharacterised region on C7. We are now aligning publicly available Brassica genetic maps from diverse sources, aiming to identify molecular markers that are likely to fall within our target region of C7. In addition to this approach, we have sequenced the transcriptome of AGSL118 using an Illumina genome analyzer.
These data were then combined with data from the two parents, A12 and GD33, which were available from another project, to enable polymorphisms and new markers to be identified specifically within the SL118 introgressed regions. The replicated nature of the B. oleracea genome means it is necessary to distinguish between paralogous loci and our target loci. By comparing the three datasets against contigs derived from a de novo assembly of A12 mRNA, single nucleotide polymorphisms (SNPs) between the three lines could be ascertained. In the majority of cases the SNPs identified were those where A12 and AGSL118 contained the same base. However, in a minority of cases, GD33 and SL118 were identical and differed from A12, suggesting that these SNPs were from the introgressed region. Over 50 such SNPs were found and 5 were tested by PCR on GD33, A12, SL118 and 8 potential segregants. All four of the PCRs that produced a product proved to be genuine SNPs between AGSL118 and A12. The software is being improved to find more markers in the region, and the SNPs found are being used to map recombinants. We can now move to genotyping with the new markers in a high-throughput manner.

We are now crossing all lines with either the C1 introgression or C1 plus different parts of the C7 introgression to A12 to allow us to refine the region conferring higher WUE. We are also crossing BC1F3 plants to A12 in order to generate plants with parts of the C7 introgression to allow us to test for epistatic interaction effects. Following this we will undertake a more detailed phenotypic analysis of water use efficiency.

In parallel to this project we are testing whether orthologs of genes found to be associated with water use efficiency in Arabidopsis have a similar function in Brassica (project WU0116). This is being done by attempting to associate particular candidate gene alleles with differences in WUE recorded in field-grown Brassica diversity sets.
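The SNP filtering rule used in the transcriptome comparison earlier in this section (keep sites where GD33 and SL118 share a base that differs from A12) can be written as a short Python sketch. This is not the project's software. The contig names, positions and base calls are hypothetical, and the sketch assumes one consensus base per line per site. It does not address the separate problem of distinguishing paralogous loci.

```python
from typing import Dict, List, Tuple

Site = Tuple[str, int]          # (contig name, position on the A12 contig)
Bases = Tuple[str, str, str]    # consensus base in (A12, GD33, SL118)


def classify_snp(a12: str, gd33: str, sl118: str) -> str:
    """Classify one site from the consensus base observed in each line."""
    if a12 == gd33:
        return "parents_identical"        # no parental polymorphism to exploit
    if sl118 == gd33:
        return "introgression_candidate"  # SL118 carries the GD33 allele
    if sl118 == a12:
        return "recurrent_background"     # SL118 carries the A12 allele
    return "unresolved"                   # three different bases; needs checking


def introgression_candidates(calls: Dict[Site, Bases]) -> List[Site]:
    """Return the sites where SL118 matches GD33 and differs from A12."""
    return sorted(
        site
        for site, bases in calls.items()
        if classify_snp(*bases) == "introgression_candidate"
    )


# Hypothetical base calls, for illustration only.
calls = {
    ("contig_0001", 120): ("A", "G", "G"),
    ("contig_0001", 480): ("C", "T", "C"),
    ("contig_0002", 75): ("G", "G", "G"),
    ("contig_0003", 310): ("T", "C", "A"),
}

print(introgression_candidates(calls))
```

Sites classified as "recurrent_background" correspond to the majority case reported above, where A12 and AGSL118 contain the same base.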
QTL for PUE: SL134

The genomic region associated with high phosphorus-use efficiency (PUE) on C3 in Brassica oleracea is currently being refined using AGSL134. It was found in this work that AGSL134 had introgressions on C1, C2 and C3. To break up the introgression on C3 and remove the non-target introgressions on C1 and C2, a number of backcrosses to the parental A12 line have been generated. We started with A12 x AGSL134 F2 backcross seed lines (from project HH3501SFV) and genotyped the BC1F2 progeny using one SSR marker for each of the introgressions on C1 and C3 (CB10097 and BRMS071; note there is a further introgression on C2 for which there are no PCR-based markers). In addition, BC2F1 plants are being grown to generate BC2F2 seed. A small-scale SSR marker screen reveals that we have 2 BC2F1 plants that have lost their introgressions on C1 and are heterozygous for the C3 introgression, plus 2 further BC2F1 plants that may also have lost their introgressions on C1 and are heterozygous for a possibly truncated C3 introgression. We are developing new markers for the C2 region using our own sequencing analysis (as for SL118) and using VeGIN SNP markers.

Associated milestones:
(M1ib) The specific introgression in SL118 that is responsible for conferring low WUE and PUE determined (complete, now refining to test the possible epistatic interaction)
(M1ii.a) BC2F2 seed for SL134 obtained for future fine-mapping work (complete)

Objective 2: To identify new loci and alleles associated with water and phosphorus-use efficiency in a model plant genome by using most recent advances in allelic association mapping in Arabidopsis thaliana

Collection of WUE and PUE trait data from more balanced diversity sets

Our previous Arabidopsis trait/GWAM work has used the ‘Nordborg 96’ Arabidopsis thaliana ecotypes.
In order to make better use of the 1,300 accessions now genotyped with 241k SNPs, and to reduce the effects of linkage disequilibrium, we have determined a more balanced population of 96 ecotypes. This will enable more rapid selection of candidate GWAM hits. The selection of the new ecotype set was informed by our previous GWAM experiments on phosphorus use efficiency, water use efficiency, and nitrogen-regulated root architecture. The new ecotypes were chosen to best distinguish and refine current GWAM regions.

To provide the material for the balanced diversity set we aimed to bulk up seeds from all 1,300 SNP-genotyped accessions. Despite genotyping information being available, many lines were not yet publicly available. Those that were available from the NASC stock centre (mostly in the larger Nordborg collection or the Borevitz collection) were obtained and seed was bulked. This resulted in a set of 263 lines.

To identify the more balanced diversity set, the GWAM results for the three most important traits in each of the three categories were first combined: nutrient use efficiency (P-accumulation), water use efficiency (gravimetric water use, 13C content, biomass) and nitrogen-regulated root architecture (total primary+lateral root length, lateral root density, and the ratio of shoot area to total primary+lateral root length, a measure of relative biomass accumulation). From these results a target list of 6-10 SNPs was selected from each category, comprising SNPs that were significant for several individual traits or particularly highly significant for one trait. For each SNP on the shortlists, we then searched for SNPs that gave a similar pattern to the target SNP across the original set of accessions, and so were also significant, but that were not closely linked to the target SNP. Such SNPs, with similar patterns but no close linkage, give rise to ambiguity as to the location of the gene driving the differences.
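The search for SNPs that mimic a target SNP without being closely linked to it can be sketched in Python as follows. The report does not state which similarity measure or linkage threshold was used, so the concordance score, the 0.9 similarity cut-off and the 100 kb distance cut-off below are assumptions. The SNP names, positions and allele patterns are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Snp:
    name: str
    chrom: int
    pos: int                   # position in base pairs
    alleles: Tuple[int, ...]   # 0/1 allele per accession, same accession order for all SNPs


def pattern_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of accessions with concordant alleles. Allele coding is arbitrary,
    so a fully inverted pattern counts as just as similar as an identical one."""
    matches = sum(x == y for x, y in zip(a, b))
    frac = matches / len(a)
    return max(frac, 1.0 - frac)


def confounded_partners(
    target: Snp,
    candidates: List[Snp],
    min_similarity: float = 0.9,      # assumed threshold
    min_distance_bp: int = 100_000,   # assumed definition of "not closely linked"
) -> List[str]:
    """Names of SNPs that mimic the target pattern but lie far from it."""
    partners = []
    for snp in candidates:
        if snp.name == target.name:
            continue
        unlinked = (
            snp.chrom != target.chrom
            or abs(snp.pos - target.pos) >= min_distance_bp
        )
        if unlinked and pattern_similarity(target.alleles, snp.alleles) >= min_similarity:
            partners.append(snp.name)
    return partners


# Hypothetical SNPs scored on ten accessions.
target = Snp("target", 1, 1_000_000, (0, 0, 1, 1, 0, 1, 0, 1, 1, 0))
snps = [
    target,
    Snp("near_copy", 1, 1_020_000, (0, 0, 1, 1, 0, 1, 0, 1, 1, 0)),  # similar but linked
    Snp("far_copy", 3, 5_000_000, (0, 0, 1, 1, 0, 1, 0, 1, 1, 0)),   # similar, unlinked
    Snp("far_flip", 1, 9_000_000, (1, 1, 0, 0, 1, 0, 1, 0, 0, 1)),   # inverted coding
    Snp("unrelated", 2, 2_000_000, (0, 1, 0, 1, 0, 1, 0, 1, 0, 1)),  # different pattern
]

print(confounded_partners(target, snps))
```

Each (target, partner) pair returned by such a search is a pair whose association signals cannot be told apart in the original accession set. Accessions in which the two SNPs disagree are therefore the informative ones to add.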
From the three categories this resulted in 51 pairs of SNPs. We then searched the list of genotyped accessions (and those from which we had bulked seed) for accessions that would tend to increase the differences in the patterns produced by these pairs of SNPs, and selected for further phenotyping those accessions that were most helpful in increasing the differences between these 51 pairs of SNPs. The final list of new ecotypes (with 24 that overlap with the Nordborg 96) is provided as Appendix 2. We are currently collecting data from a controlled environment cabinet experiment designed to yield new phosphorus and water use efficiency data from these 96 ecotypes. We are continuing to search for new ways to improve our association mapping analysis (GWAM) methods and will use an optimal method to analyse the new phenotypic data.

Associated Milestones:
(M2ii.a) New, more balanced Arabidopsis populations for GWAM studies determined (complete)

Investigating root traits for resource use efficiency

We have been carrying out a GWAM experiment using the 96 Nordborg Arabidopsis thaliana ecotype set to characterise root architecture and growth responses to combinations of nitrogen (N) and phosphorus (P). This experiment will identify new loci that could enable the plant to use fewer nutrients for the same root biomass output.

Our first task was to determine an experimental treatment space of N and P levels. We carried out a pilot experiment with two ecotypes that we had previously found to show large differences in root architecture and size, and in N-responsiveness. Col-0 and Kas-2 ecotypes were grown on agar under 25 different combinations of levels of N (5 levels of KNO3, from 30 µM to 5000 µM) and P (5 levels of H2KO4P, from 6.25 µM to 625 µM), with and without 30 mM sucrose, in order to ascertain the best conditions in which to carry out the large-scale experiments with all the ecotypes.
After 12 days' growth on agar, the fresh mass of the plants was recorded, and the root size and architecture of each plant were analysed by scanning each plate and analysing the roots using the software package EZ Rhizo (http://www.ez-rhizo.psrg.org.uk/). Root system measurements have included primary root length, lateral root lengths and numbers, and lateral root density. Even without detailed analysis, obvious differences between the two ecotypes are evident, particularly at high concentrations of P combined with middling levels of N (e.g. see Fig. 2).

Fig. 2. Kas-2 and Col-0 seedlings grown vertically on agar plates containing 625 µM H2KO4P, 500 µM KNO3, and 30 mM sucrose.

The full-scale experiment involves growing all 96 ‘Nordborg’ ecotypes on 3 combinations of N (KNO3) and P (H2KO4P) with 30 mM sucrose for 12 days (62.5 µM P & 100 µM N; 625 µM P & 500 µM N; 125 µM P & 5000 µM N). At day 12 we measure plant fresh mass and root size/architecture. After GWAM analysis we will move on to analyse N and P efficiency in select lines and validate the best GWAM hits using phenotypic analysis of gene perturbations, initially using T-DNA knockout lines (widely available for Arabidopsis).

Associated Milestones:
(M2ii.c) WUE, PUE, NUE and root development data for Arabidopsis GWAM studies collected (complete)

Objective 3: To further validate and discover new QTL for PUE and WUE by association mapping in Brassica oleracea

We will start this work in year 2 of the project. Our main focus so far has been communicating with scientists on the VeGIN project to ensure that Brassica oleracea seed is ready for use and that the VeGIN project to identify SNP markers is on target.

Appendix 1. Genotyping results matrix for BC1F3 lines derived from AGSL118. G (dark green) denotes detection of a GD33 allele; A (light green) denotes detection of an A12 allele; H (yellow) denotes both parent alleles detected; A/H or G/H denotes an ambiguous result; (-) denotes a missing value.
The 6 new markers are highlighted in red text. The 5 BC1F3 lines chosen for WUE phenotypic analysis are highlighted in blue text.

Appendix 2. 96 Arabidopsis thaliana ecotypes selected to represent a more balanced diversity set for water, nitrogen, and phosphorus use efficiency.