GENETICS
genetic mapping, classical approaches to studying gene function

Basic aims:
• uncovering gene function
  understanding the mechanisms of morphogenesis, development, metabolism, physiology etc. in connection with coordinated gene expression
• breeding
  production of plants (organisms) with improved characteristics or combinations of them

Terminology
Gene
• segment of genomic information that specifies a trait
• basic unit of heredity in living organisms
• genotype + environment + ? = phenotype
• interactions between genes/proteins (epistasis – metabolic and signalling pathways)
Allele – a form of a gene
• dominant vs. recessive
• new alleles arise by mutation
Locus – location of a gene on a chromosome
• Genetic linkage – joint inheritance of certain genes (their alleles) because they reside on the same chromosome (gene distance in cM = % of recombinant gametes)
• Genetic (linkage) maps x physical maps

Genetic (linkage) and physical maps differ
- the likelihood of recombination varies along the chromosome – distances in cM (0-50 cM)
What sequences have a lower recombination probability?

Genetic linkage x crossing-over during meiosis
[Figure: 1. cytological event, 2. genetic result. Parental chromosomes of the parental genotype (heterozygous Aa and Bb) with locus A and locus B. Meiosis without crossing-over gives only non-recombinant gametes (same as the parental genotype); meiosis with crossing-over gives gametes 1-4, both non-recombinant and recombinant (new).]

Genetic maps
- genes (identifiable)
- markers (= any detectable feature with a known position on the chromosomes)

Genetics: classical (direct) x reverse
Direct – from a trait (phenotype) to identification of the corresponding gene
Reverse – from a gene to a phenotype (study of gene function by mutagenesis, modulated expression, …)
- both approaches need mutants

Mutagenesis
Direct – looking for a certain phenotype in a mutant population
Reverse – targeted mutagenesis/modification of a selected gene
• Classical:
  – chemical mutagenesis – EMS (ethyl methanesulfonate; point mutations)
  – physical mutagenesis – X-rays, gamma radiation ...
(usually short deletions)
  – wide spectrum of effects (regulation, interactions) – even dominant mutations; they resemble natural mutations; identification of the mutated gene is difficult/expensive

Mutagenesis
• Advanced:
  – insertional mutagenesis – T-DNA, transposons
  – random insertions
  – allows simple determination of the insertion site = the mutation is attached to a tag (the inserted sequence)
  – various strategies for gene isolation

Gene isolation based on a phenotypic change caused by insertion
[Figure label: original gene]
Insertional inactivation
- T-DNA tagging
- transposon tagging
Activation mutagenesis
- the inserted sequence contains a promoter or enhancer that can activate expression of an adjacent, otherwise inactive gene
Promoter trap, enhancer trap
- T-DNA with a reporter gene without a promoter (promoter trap) or with a minimal promoter (enhancer trap)
- selection based on reporter gene expression

Identification of mutated gene
Based on genetic map and segregation analysis
- mapping – determination of the position of the mutation in the genetic map by cosegregation with genetic markers (polymorphic between the parental genotypes)
- identification of the mutated sequence – chromosome walking, sequencing, comparison with WT

Identification of mutated gene (responsible for the mutant phenotype)
Point mutations, short deletions
1) Based on genetic map and segregation analysis + chromosome walking, sequencing (long, expensive)
2) Using NGS (quick, moderately expensive)
   - even in unknown genomes!
   - mixed samples (backcrosses)
   - comparison of the frequencies of similar oligomers
   Nordström et al., Nature Biotech. 2013

Identification of mutated gene
Insertional mutagenesis:
• sequencing of the flanking region (template concentration is too low for direct sequencing!)
  - TAIL PCR (Thermal Asymmetric InterLaced PCR)
  - adaptor PCR
  - plasmid rescue
  - iPCR

TAIL PCR:
[Figure: specific primers SP1, SP2, SP3 and arbitrary primers (AP) in three successive PCRs]
1. three PCRs (optimized Ta) with specific primers SP1-3 + a certain AP
2.
product sequencing
SP1-3: complementary to the inserted DNA
AP: arbitrary (degenerate) primer - several universal types, high probability of annealing near the insertion

Adaptor PCR:
[Figure: restriction sites (E), specific primers SP1, SP2, SP3 and specific adaptor primer (SAP)]
1. cleavage (restriction endonuclease, E)
2. ligation of adaptors
3. 2-3 PCRs (specific adaptor primer + specific primers complementary to the inserted DNA)
4. product sequencing

Plasmid rescue:
[Figure: inserted DNA carrying ori and bla/nptIII, restriction sites (E), resulting plasmid]
1. cleavage (E)
2. circularization (ligation)
3. transformation of E. coli (ori, R)
4. multiplication in bacteria
5. sequencing

Inverse PCR:
[Figure: restriction sites (E) around the insertion]
1. cleavage (E)
2. circularization (ligation)
3. PCR
4. sequencing

Collections of insertion mutants
- publicly available (Arabidopsis, rice, …)
- insertions at different positions in the genome – practically all genes (inactivation – 5’ exons, minimal promoter; confirmation by expression analysis necessary!)
- mutant selection in silico, ordering seeds
[Figure: Gene1, Gene2, Gene3 with the sites of T-DNA insertions in individual lines (1-8 = line number)]
WWW interface: http://signal.salk.edu/cgi-bin/tdnaexpress

Direct genetics
- selection of mutants by altered phenotype
[Figure: mutants shootmeristemless and agamous]
Mutant screens – phenotype, conditions, treatments, …

The same phenotypic change can result from different mutations
„there are numerous ways to build a house incorrectly“
- allelic mutations – mutations in the same gene (x in different genes)
How to distinguish them (recessive mutations)?
Crossing of homozygous mutants:
F1 – wt = different genes (complementation)
   – mutant = allelic

Direct and reverse genetics in Arabidopsis
[Figure: reverse and direct approaches; identification of the mutation site]
+ TILLING – „searching“ in a non-characterized collection of lines by PCR and reassociation

TILLING: detection of mutants with point mutations in a certain gene
Targeting Induced Local Lesions IN Genomes
• Principle: chemical mutagenesis (EMS)
• PCR- and heteroduplex analysis-based screen
• Point mutations! (changed regulation, interactions, …)

TILLING
1. PCR of the selected sequence from DNA stocks isolated from the mutant population
2. Reassociation with the PCR fragment from a wt plant
3. Cleavage of the single-stranded sites of the heteroduplex + electrophoretic separation of end-labelled fragments
TILLING – screening strategy

Identification/mapping of unknown (mutated) genes („with phenotype“) by cosegregation analysis
Based on genetic map
1. mapping – genetic linkage with genetic markers (dense polymorphic markers are necessary!)
2. identification of the gene
   - chromosome walking
   - sequencing (sequence comparisons)

Basic set of genetic markers in Arabidopsis thaliana
2-3 on every chromosome arm

Cosegregation analysis in F2 generation
A, B – full linkage!
A, C – free recombination
P1 (homozygote): AA bb cc – gametes A b c
P2 (homozygote): aa BB CC – gametes a B C
F1 (heterozygote): Aa Bb Cc – gametes A b c, a B C, A b C, a B c
F2 – full linkage: AB:Ab:aB:ab = 2:1:1:0 (genotypes AA bb, Aa Bb, aa BB)
F2 – without linkage: AC:Ac:aC:ac = 9:3:3:1 (genotypes AA CC, AA Cc, AA cc, Aa CC, Aa Cc, Aa cc, aa CC, aa Cc, aa cc)

Segregation in F2 generation
(P = XXyy x xxYY, F1 = XxYy – the frequency of gametes depends on the linkage)
Relative gamete frequencies in this example: parental Xy and xY = 1, recombinant XY and xy = 0.5
Each cell: F2 genotype, phenotype (relative frequency)

gametes  | XY (0.5)       | Xy (1)         | xY (1)         | xy (0.5)
XY (0.5) | XXYY XY (0.25) | XXYy XY (0.5)  | XxYY XY (0.5)  | XxYy XY (0.25)
Xy (1)   | XXYy XY (0.5)  | XXyy Xy (1)    | XxYy XY (1)    | Xxyy Xy (0.5)
xY (1)   | XxYY XY (0.5)  | XxYy XY (1)    | xxYY xY (1)    | xxYy xY (0.5)
xy (0.5) | XxYy XY (0.25) | Xxyy Xy (0.5)  | xxYy xY (0.5)  | xxyy xy (0.25)

Phenotypes XY:Xy:xY:xy
9:3:3:1 – no linkage = different chromosomes (arms)
4.75:2:2:0.25 – weak genetic linkage (the table above)
Looking for strong linkage!

Types of genetic markers
= trait with a known or identifiable position in the genetic map and with polymorphism between the parental genotypes (e.g.
different ecotypes)
• Morphological (limited number)
• Molecular
  – DNA markers – detectable differences in DNA sequence
  – isozymes

Natural morphological variability of Arabidopsis ecotypes

Morphological markers
Gene symbol | Name                 | Phenotype                                                | Location (chr.-cM)
an-1        | angustifolia         | narrow leaves, crinkled siliques                         | 1-55.2
ap1-1       | apetala              | no petals                                                | 1-99.3
py          | pyrimidine requiring | white leaves, restored by pyrimidine                     | 2-49.1
er-1        | erecta               | compact inflorescence, blunt siliques                    | 2-43.5
hy2-1       | long hypocotyl       | elongated hypocotyl, slender                             | 3-11.5
gl1-1       | glabra               | no trichomes                                             | 3-46.2
bp-1        | brevipedicellus      | short pedicels, siliques bent downwards, short plant     | 4-15.0
cer2-2      | eceriferum           | bright green stems, siliques bent downwards, short plant | 4-51.9
ms1-1       | male sterile         | no siliques                                              | 5-2.5
tt3-1       | transparent testa    | yellow seeds, no anthocyanin                             | 5-57.4

Molecular markers in Arabidopsis
DNA molecular markers (= usually an electrophoretic band)
• RFLP (Restriction fragment length polymorphism) + Southern
• RAPD (Random amplified polymorphic DNA)
• AFLP (Amplified fragment length polymorphism)
• SSR (Simple sequence repeats)
• SNP (Single nucleotide polymorphism)

Cosegregation analysis with molecular markers
• Crossing of different genotypes with high polymorphism (multiple differences in markers)!
• Many markers can be analysed at once
• Which of the markers A, B, C, D is linked with locus R?
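
The question of which marker is linked with locus R can be illustrated with a small calculation. The sketch below is not taken from the slides; it assumes an F2 population from a cross between a mutant (rr) parent and a polymorphic RR parent, codominant markers, and that only plants with the recessive r phenotype are scored. Under these assumptions every marker allele from the RR parent found in an rr plant marks one recombinant chromosome. The function names, the genotype codes m/h/w and all example scores are hypothetical.

```python
from collections import Counter


def recombination_fraction(scores: str) -> float:
    """Estimate the recombination fraction between a codominant marker and locus R.

    scores holds one character per F2 plant with the recessive (rr) phenotype:
      'm' = homozygous for the marker allele of the mutant parent (0 recombinant chromosomes)
      'h' = heterozygous at the marker                            (1 recombinant chromosome)
      'w' = homozygous for the marker allele of the other parent  (2 recombinant chromosomes)
    """
    counts = Counter(scores)
    unknown = set(counts) - {"m", "h", "w"}
    if unknown:
        raise ValueError(f"unexpected genotype codes: {sorted(unknown)}")
    n_plants = sum(counts.values())
    if n_plants == 0:
        raise ValueError("no plants scored")
    recombinant_chromosomes = counts["h"] + 2 * counts["w"]
    # Every plant contributes two chromosomes to the sample.
    return recombinant_chromosomes / (2 * n_plants)


def rank_markers(marker_scores: dict) -> list:
    """Return (marker, recombination fraction) pairs, most tightly linked marker first."""
    fractions = {name: recombination_fraction(s) for name, s in marker_scores.items()}
    return sorted(fractions.items(), key=lambda item: item[1])


# Hypothetical scores for 20 rr plants per marker (invented for illustration only).
example_scores = {
    "A": "m" * 5 + "h" * 10 + "w" * 5,
    "B": "m" * 18 + "h" * 2,
    "C": "m" * 12 + "h" * 7 + "w" * 1,
    "D": "m" * 6 + "h" * 9 + "w" * 5,
}

if __name__ == "__main__":
    for name, r in rank_markers(example_scores):
        print(f"marker {name}: r = {r:.3f} ({100 * r:.1f} % recombinant chromosomes)")
```

With these invented scores marker B comes out as the most tightly linked one (2 recombinant chromosomes out of 40), while marker A segregates 1:2:1 and reaches the 50 % expected without linkage. Real data sets are larger, and values close to 50 % only say that no linkage was detected.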
[Figure: marker patterns of individual plants with phenotype r and phenotype R]

Bulked segregant analysis
• Strong linkage – possibility to analyze in bulk
[Figure: marker patterns of bulked samples with phenotype r and phenotype R]

Examples of DNA molecular markers
Known sequence and position in the genome
• RFLP (Restriction fragment length polymorphism) + Southern hybridization
Unknown sequence and position (randomly visualized sequences); sequence and position are determined subsequently only for those in genetic linkage with a trait
• RAPD (Random amplified polymorphic DNA)
• AFLP (Amplified fragment length polymorphism)
[Figure panels: RFLP, RAPD, AFLP]

Finding two markers flanking the mutated gene
„Chromosome walking“
[Figure label: mutated gene X]
Libraries of large genomic fragments
- YACs, BACs = yeast (bacterial) artificial chromosomes, ~ 300 (100) kbp
- cosmids (λ phage, 50 kbp)
Looking for overlaps using hybridization

Marker assisted selection (MAS)
A molecular marker in strong genetic linkage with a certain trait can be used for screening hybrids instead of phenotypic characterization
Advantages:
• Not influenced by environmental conditions
• Seedlings can be screened
• Often simpler and cheaper
• Possibility to distinguish between homo- and heterozygotes (using certain markers)

Identification of genes by function (interaction)
Yeast two-hybrid screen for protein interactors
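
As a supplement to the marker assisted selection slide above, the following sketch shows why the linkage has to be strong. It is an illustration under stated assumptions, not part of the original slides: an F1 with the marker allele M and the trait allele T on the same chromosome (haplotypes M-T / m-t), an F2 population, and selection of plants that are homozygous MM. The function names are hypothetical, and the recombination fraction r is treated as cM / 100, following the definition of cM as % of recombinant gametes used in these notes.

```python
def gamete_frequencies(r: float) -> dict:
    """Gamete frequencies of an F1 with haplotypes M-T / m-t for recombination fraction r."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie between 0 and 0.5")
    return {
        "MT": (1 - r) / 2,  # parental
        "mt": (1 - r) / 2,  # parental
        "Mt": r / 2,        # recombinant
        "mT": r / 2,        # recombinant
    }


def prob_tt_given_mm(r: float) -> float:
    """Probability that an F2 plant selected as marker-homozygous MM is also TT."""
    g = gamete_frequencies(r)
    p_mm = (g["MT"] + g["Mt"]) ** 2   # both gametes carry M
    p_mm_and_tt = g["MT"] ** 2        # both gametes carry M and T
    return p_mm_and_tt / p_mm         # algebraically equal to (1 - r) ** 2


if __name__ == "__main__":
    for cm in (1, 5, 10, 20, 50):
        r = cm / 100
        print(f"{cm:>2} cM: P(TT | MM) = {prob_tt_given_mm(r):.3f}")
```

Under these assumptions a marker 1 cM from the gene selects the desired homozygote in about 98 % of cases, whereas an unlinked marker (50 cM) does so in only 25 %, which is the plain F2 expectation for TT.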