Genes and metabolites to phenotypes: QTL mapping of glucosinolate production in Arabidopsis

A major quantitative metabolic trait:
• Specialist insects like glucosinolates - attractant
• Generalist insects hate glucosinolates

Fluctuating selection and genetic constraint
• Positive selection and negative selection bounce around from year to year
• This forces glucosinolates to occupy a middle ground, but also to stay diverse enough to change

Glucosinolate Biosynthetic Pathway
[Figure: glucosinolate biosynthetic pathway. Labels: Tryptophan; Methionine; Elongated Methionine; 2-oxo acid; 3-malate derivative; 2-malate derivative; Aconitase; CYP79F; CYP83A1; GSTF11; C-S Lyase; SGT74B1; ST5C; ST5B; FMO; AOP2; 3-methylthio, 3-methylsulfinyl, 3-hydroxypropyl and allyl glucosinolates.]

Variation in total methionine-derived glucosinolate content
[Figure: total aliphatic glucosinolate (µmol mg-1) by accession; Cvi-1, Col-0, Bay-0, Shahdara and Ler-0 are labelled.]
• Two genotypes recreate species?

It is possible to rapidly recreate the diversity in the wild with a single outcross
• Genetic and phenotypic variance not linearly associated

Glucosinolates are phenotypically constrained in the wild
[Figure: total aliphatic glucosinolate (µmol mg-1) for four groups, with the number of genotypes in each: set of accessions (96), Bay-0 x Sha (2), Ler x Col-0 (2), Ler x Cvi (2).]

Transcripts and metabolites have different genetics
[Figure: percent of traits (0% to 40%) against heritability (0.05 to 1.00) for transcripts, metabolites and GLS.]

Mapping glucosinolate content and regulation
[Diagram: Bay-0 x Sha cross, self x5, F7 RILs.]
• Expression QTLs controlling transcript level of biosynthetic genes identified
• Test for association between polymorphisms controlling enzyme-encoding gene & resulting metabolites
• Controlling factors are a mix of enzymes and regulatory factors
• Regulatory connections can feed back from metabolism to transcripts
• Cloning novel glucosinolate regulators

QTL mapping of glucosinolate production in Arabidopsis
• In Wentzell et al. (2007) the well-studied glucosinolate gene network was used to test the feasibility of an a priori-defined gene network approach to mapping QTLs and eQTLs in 148 Bay-0 x Sha RILs
• Bay-0 and Sha are two Arabidopsis thaliana ecotypes (natural variants) - they are the same species but have distinct genotypes and phenotypes
• A Bay-0 x Sha cross was made (the parents have divergent glucosinolate content), then recombinant inbred lines (RILs) were generated. Each line is homozygous for a (different) mixture of Sha and Bay-0 genes
• In each RIL:
- measured ~48 glucosinolate metabolites
- measured ~95 genetic markers
- measured expression of ~25k genes with microarrays (for eQTLs)
• (1) Used QTL-mapping methods to identify parts of the genome (genes) that were associated with different levels of the different glucosinolates
(2) Used eQTL-mapping methods to identify glucosinolate-regulatory networks

[Figure: glucosinolate biosynthetic pathway annotated with the GSL.Elong, GSL.OX and GSL.ALK QTLs.]
• Results suggest that natural variation in transcripts may significantly impact phenotypic variation, but that natural variation in metabolites or their enzymatic loci can feed back to affect the transcripts
• Analysis of Arabidopsis natural variants detected several QTLs (-> identify genes) affecting glucosinolate content
• mRNA levels for these genes should be different in these ecotypes
• We will test these predictions by extracting RNA and testing with quantitative PCR

Quantitative PCR principles
[Diagram: tissue -> extract RNA -> reverse transcribe -> cDNA -> QPCR.]
- a type of PCR reaction that enables detection and quantification of gene expression
- use gene-specific primers to amplify your gene of interest
- the product is fluorescently labelled by incorporating a dye into the reaction; the dye binds to the newly synthesised dsDNA
- measuring how much fluorescence there is tells you how much dsDNA there is
- fluorescence is measured at the end of each PCR cycle (see graph on next slide)
- often called real-time PCR since the data generated show the progression of the reaction over time (rather than just at the end, as you have for standard PCR)
- the amount of dsDNA produced depends on how much cDNA was present for your specific gene, and thus on how much of your gene was expressed in the starting RNA sample

Quantitative PCR output graph:
• QPCR cycle number on the x-axis, fluorescence level on the y-axis.
• To compare the fluorescence of different samples (e.g. 1-4 below), the cycle number at which each line crosses an (arbitrary) fluorescence cutoff (the threshold cycle) is determined...

crossing points:
1  27.3
2  27.1
3  28.0
4  28.5
5  --- (negative control)

1+2 = more mRNA of the gene being detected, since the cutoff is crossed at a lower cycle number

QPCR standard curve
• ... and then compared to a standard curve that tells you what the threshold cycle (y-axis below) will be for a given starting amount of gene expression (x-axis below)
• The standard curve will be different for different genes, since the PCR products for different genes are amplified at different efficiencies. WHY?
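As a rough illustration of how a standard curve is used, the sketch below fits a straight line of threshold cycle against log10(starting quantity) and then reads quantities back from the crossing points listed above. The dilution series (`STANDARD_QUANTITIES`, `STANDARD_CTS`) is invented for the example and is not data from the practical; the function names are also hypothetical.

```python
import math

def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve: starting quantity for a given threshold cycle."""
    return 10 ** ((ct - intercept) / slope)

def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 means perfect doubling each cycle)."""
    return 10 ** (-1 / slope) - 1

# Hypothetical 10-fold dilution series (arbitrary units), NOT real data.
STANDARD_QUANTITIES = [1000.0, 100.0, 10.0, 1.0]
STANDARD_CTS = [21.0, 24.4, 27.8, 31.2]

slope, intercept = fit_standard_curve(STANDARD_QUANTITIES, STANDARD_CTS)

# Crossing points of samples 1-4 from the output graph slide.
crossing_points = {1: 27.3, 2: 27.1, 3: 28.0, 4: 28.5}
estimated = {s: quantity_from_ct(ct, slope, intercept)
             for s, ct in crossing_points.items()}

for sample, quantity in estimated.items():
    print(f"sample {sample}: Ct {crossing_points[sample]} -> {quantity:.1f} units")
print(f"efficiency from slope: {amplification_efficiency(slope):.2f}")
```

A lower crossing point maps to a larger estimated quantity, which is why samples 1 and 2 come out higher than 3 and 4. Because the slope enters the calculation, a curve fitted for one primer pair should not be reused for another gene.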
Have to normalise between different samples
• The QPCR reaction will give you a threshold cycle for your gene, which you can compare to your standard curve to find out how much of your gene was expressed
• To compare between samples you need to know that you started with the same amount of cDNA
• To assess the amount of cDNA we use a control/housekeeping gene (e.g. Tubulin) for which the level of gene expression is ALWAYS proportional to the amount of cDNA present - can't just rely on [RNA] values (esp. if the nanodrop didn't even work!!)
• So you can normalise your gene expression quantification results against the Tubulin results to allow you to compare e.g. the amount of AOP2 expression between different Bay-0 and Sha samples

[Diagram: SAMPLE 1 and SAMPLE 2, each assayed with primers designed to detect specific transcripts (genes).]

QPCR melt curve/peak: check that the primers are amplifying just one product (specific assay) - gives one peak

Write-up for practical on QTL mapping of glucosinolate production in Arabidopsis
• You'll write this practical up as if you had carried out this work:
- a brief introduction (2 pages max)
  - explain the principle of QTL mapping and how it can be used in agriculture to find genes that control phenotypes (traits) of interest for crop improvement (find some examples of this)
  - explain why glucosinolates are a trait of interest to study
- a brief methods section (2 pages max)
  - explain what the genotype and phenotype data that you used were, and where they came from (refer to the Wentzell et al. paper)
- results and discussion section:
• What type of information can you find out about the QTLs that are detected in the analysis that you carried out? Why might the QTLs that you detect be different when you use different QTL mapping methods (Haley-Knott etc.)? (2 pages max)
• Plot an eQTL LOD for AOP3. What does this plot tell you? Explain what it shows in detail. Are there cis or trans (or both) effects? Which genes are predicted to regulate AOP3 expression? Identify the genes from the LOD plots and use www.arabidopsis.org (TAIR) to search for information on them. List all of the genes that you think might control AOP3 and suggest which ones you think are most likely (2 pages max).
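As a reminder of what a LOD curve measures, here is a minimal sketch of the simplest QTL method, single-marker regression, run on simulated data. It is not the analysis from the practical: the genotypes are random and unlinked, and the marker count, effect size and names (`marker_lod`, `CAUSAL_MARKER`) are assumptions made for illustration. For each marker, the LOD compares a model with one mean per genotype against a model with a single overall mean: LOD = (n/2) log10(RSS_null / RSS_marker).

```python
import math
import random

def marker_lod(genotypes, phenotypes):
    """LOD for one marker: per-genotype means versus a single overall mean."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for allele in set(genotypes):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        group_mean = sum(group) / len(group)
        rss_marker += sum((y - group_mean) ** 2 for y in group)
    return (n / 2) * math.log10(rss_null / rss_marker)

# Simulated RIL population (assumption: unlinked markers, one causal marker).
random.seed(1)
N_LINES = 148        # number of RILs, as in the Bay-0 x Sha population
N_MARKERS = 5        # far fewer than the ~95 real markers, for brevity
CAUSAL_MARKER = 2    # hypothetical marker that affects the trait

# Each RIL is homozygous Bay-0 ("B") or Sha ("S") at every marker.
genotype_table = [[random.choice("BS") for _ in range(N_MARKERS)]
                  for _ in range(N_LINES)]

# Trait value (e.g. a transcript level): Sha allele adds 2 units, plus noise.
phenotypes = [(2.0 if line[CAUSAL_MARKER] == "S" else 0.0) + random.gauss(0.0, 1.0)
              for line in genotype_table]

lods = [marker_lod([line[m] for line in genotype_table], phenotypes)
        for m in range(N_MARKERS)]

for m, lod in enumerate(lods):
    print(f"marker {m}: LOD = {lod:.2f}")
```

For an eQTL the trait is the expression level of a gene such as AOP3. A LOD peak at the gene's own map position points to a cis effect, and a peak elsewhere to a trans effect. Methods such as Haley-Knott regression also estimate genotype probabilities between markers, which is one reason the peaks can shift between methods.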