[Figure: word cloud of terms from morphological and molecular systematics, e.g. phenotype, genotype, morphology, crossability, UPGMA, parsimony, bootstrap, maximum likelihood, Bayesian inference, RFLPs, RAPDs, AFLPs, SSRs, SNPs]

Types of DNA

Plants have THREE genomes:
1) Nucleus
2) Chloroplast
3) Mitochondrion

Nuclear DNA

• Large size, ca 10 x 10^6 kb in flowering plants
• Linear arrangement, as chromosomes
• Inheritance biparental
• Recombination

Chloroplast DNA

• Small, 120-220 kb
• Circular, usually with inverted repeat
• No recombination
• Inheritance usually maternal in angiosperms, paternal in gymnosperms
• Constant gene order in all green plants

[Figure: chloroplast genome map with large and small single-copy regions; labelled genes: atpE, atpB, rbcL, matK, rpl2, 16S, 23S, psbA, trnH]

Mitochondrial DNA

Animal
• 14-26 kb
• Circular, usually homogeneous among cells
• Mutation rates high at sequence level; substitutions
• No recombination, inheritance maternal

Plant
• 150-2500 kb
• Set of different-sized circles, which arise from processes that interconvert between mother circle & subgenomic circles
• Rapid evolution in gene order but slower at sequence level (ca x100 slower than in animals)
• No recombination, inheritance maternal

Main sources of DNA evidence

• Control centres: turn genes on & off
• Genes (single-copy & multi-copy): code for proteins
• Inter-genic spacers: non-coding sequences between genes
• Introns: non-coding sequences within genes
• Transposons & retroviruses

[Figure: chart on a 0-100 scale dividing the genome into coding and non-coding DNA]

Gene structure

[Figure: gene diagram with labels upstream enhancer, promoter, TATA box, 5’ UTR, exon 1, intron 1, exon 2, intron 2, exon 3, 3’ UTR, spacer]

• Exons are composed of start, amino acid & stop codons.
• Highly conserved regions.
• Useful at higher taxonomic levels, e.g. genus & above.

• Introns are non-coding regions within a gene.
• Spacers are non-coding regions between genes.
• Both potentially highly variable regions.
• Useful at genus level and below, sometimes down to population level.

Multi-copy genes: rDNA

[Figure: rDNA repeat unit with labels IGS, 18S, ITS1, 5.8S, ITS2, 25S, IGS]

• Tandem repeats: 100s to 1000s of copies.
• Nuclear genome: biparental inheritance.
• Coding regions (nS) highly conserved: the 18S gene of soyabean shares 75% nucleotide sequence identity with yeast.
• ITS & IGS regions highly variable; concerted evolution is sometimes a problem.

Making inferences from the data

• Gene trees vs species or organism trees: often only two genes (or regions) are studied [out of ca 25,000 genes present].
• Data from the different genomes may or may not be congruent: each genome tells its own story, which may not be that of the whole organism.

Approaches

Aims:
• Phylogeny reconstruction, systematics
• Genepool & population level phenomena

Techniques:
• Sequencing
• RFLPs
• ‘Fingerprinting’: RAPDs, AFLPs, microsatellites
• Allozymes (protein products of genes)

[Figure: sequence electropherogram]

Phylogenetic systematics

• Parsimony. Identifies the tree with the minimum number of mutations (character-state changes).
• Maximum likelihood. Identifies the tree that has the highest probability of producing the observed data, given a particular model of evolution.
• Bayesian inference. Like maximum likelihood in using an explicit model of evolution, but combines the likelihood with prior probabilities to give posterior probabilities for trees. Hurts the brain!

ALL TREES CAN BE TESTED STATISTICALLY!!!
• bootstrap
• jackknife
• decay index

Genepool & population phenomena

[Figure: six values labelled "st": 0.447, 0.468, 0.555, 0.390, 0.287, 0.289]

RFLPs: Restriction Fragment Length Polymorphisms

• Use restriction enzymes to cut DNA at recognition sites (usually 6 bp long).
• Separate fragments on an agarose gel.
• Stain fragments with ethidium bromide & view with UV.

Fragment patterns in hybrids

[Figure: restriction map with a nuclear DNA probe and gel diagrams for genotypes AA, AB and BB; enzyme 1 fragment sizes 19, 12 and 7; enzyme 2 fragment sizes 14, 9, 5 and 4]

• Different patterns are the result of gains/losses of restriction sites or inversions.
• Co-dominant in nuclear DNA: good for detecting hybrids.
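The RFLP logic above (a gained or lost restriction site changes the fragment sizes, and a heterozygote shows the bands of both alleles) can be illustrated with a minimal Python sketch. The two 19-base alleles and the helper names `digest` and `gel_bands` are hypothetical and chosen only for illustration. The sizes echo the 19 = 12 + 7 pattern in the enzyme 1 panel, but which genotype carries the site is an assumption. The sketch also ignores the probe step, so every fragment is treated as visible. The EcoRI site GAATTC is used as an example of a 6 bp recognition site.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Fragment lengths from cutting a linear sequence at every copy of `site`.

    cut_offset is how many bases into the site the enzyme cuts
    (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def gel_bands(allele_1, allele_2, site="GAATTC", cut_offset=1):
    """Distinct fragment sizes a diploid individual shows, largest first."""
    sizes = set(digest(allele_1, site, cut_offset))
    sizes |= set(digest(allele_2, site, cut_offset))
    return sorted(sizes, reverse=True)


# Hypothetical 19-base alleles: one carries the site, the other has lost it
# through a single substitution (GAATTC -> GAATTT).
WITH_SITE = "C" * 11 + "GAATTC" + "TT"
WITHOUT_SITE = "C" * 11 + "GAATTT" + "TT"

print(gel_bands(WITH_SITE, WITH_SITE))        # homozygote, site present
print(gel_bands(WITHOUT_SITE, WITHOUT_SITE))  # homozygote, site absent
print(gel_bands(WITH_SITE, WITHOUT_SITE))     # heterozygote or hybrid
```

Under these assumptions the heterozygote is the only genotype that shows all three bands, which is what makes a co-dominant marker useful for detecting hybrids.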
RAPD: Randomly Amplified Polymorphic DNA

[Figure: gel comparing the RAPD bands of individuals A and B]

• Arbitrary 10 bp primers
• Target sequences flanked by inverted repeat primer sites
• Permits multiple annealing throughout all three genomes
• Coding & non-coding regions; single- & multi-copy DNA
• Inherited as a dominant marker (cannot distinguish heterozygotes from homozygotes)

AFLPs: Amplified Fragment Length Polymorphisms

• Cut DNA with a pair of enzymes: one rare cutter & one common cutter
• Attach known DNA sequences to the products
• Amplify products using the known sequences as priming sites
• Rather like RAPDs but much more reproducible
• Dominant inheritance

Microsatellites (SSRs: Simple Sequence Repeats)

(GA)7: primer - flanking - GAGAGAGAGAGAGA - flanking - primer
(GA)5: primer - flanking - GAGAGAGAGA - flanking - primer

• Short (1-6 bp), tandem repeats (10-50 copies)
• Mono- to tetra-nucleotides, e.g. (AT)n
• Random distribution assumed
• Primers designed for conserved flanking regions
• Variation in repeat number gives polymorphism
• Co-dominant inheritance

Summary

Type of study                                  Type of DNA       Preferred marker
Gene diversity & breeding system               nrDNA             co-dominant markers: microsats, allozymes
Genotype diversity, clonality, individuality   nrDNA             high resolution markers: microsats, RAPDs, AFLPs
Population structure & gene flow               nr or cp/mt DNA   all
Phylogeography (genepool structure)            cp/mt DNA         sequences, RFLPs
Speciation                                     nr + cp/mt DNA    all
Inter-specific hybridisation                   nr + cp/mt DNA    microsats, allozymes, RFLPs, AFLPs
Systematics (above sp. level)                  nr + cp/mt DNA    sequences, (AFLPs)
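The summary recommends microsatellites wherever a co-dominant marker is needed. As a rough illustration of why, the Python sketch below counts repeat units between conserved flanking regions and scores a diploid genotype, using the (GA)7 and (GA)5 alleles from the microsatellite slide. The flanking sequences and the function names `repeat_count` and `score_genotype` are hypothetical. In practice alleles are usually sized as PCR products on a gel or sequencer rather than read from sequence, so this is a sketch of the principle and not a genotyping workflow.

```python
import re

# Hypothetical conserved flanking regions on either side of the repeat.
LEFT_FLANK = "TTCACC"
RIGHT_FLANK = "CTGGTA"


def repeat_count(amplicon, motif="GA", left=LEFT_FLANK, right=RIGHT_FLANK):
    """Number of tandem copies of `motif` between the two flanks, or None."""
    pattern = re.escape(left) + "((?:" + re.escape(motif) + ")+)" + re.escape(right)
    match = re.search(pattern, amplicon)
    if match is None:
        return None
    return len(match.group(1)) // len(motif)


def score_genotype(allele_1, allele_2, motif="GA"):
    """Return the sorted repeat counts and whether they differ."""
    counts = [repeat_count(allele_1, motif), repeat_count(allele_2, motif)]
    if None in counts:
        raise ValueError("flanking regions or repeat not found")
    counts.sort()
    state = "heterozygote" if counts[0] != counts[1] else "homozygote"
    return tuple(counts), state


ALLELE_GA7 = LEFT_FLANK + "GA" * 7 + RIGHT_FLANK
ALLELE_GA5 = LEFT_FLANK + "GA" * 5 + RIGHT_FLANK

print(score_genotype(ALLELE_GA7, ALLELE_GA5))
print(score_genotype(ALLELE_GA5, ALLELE_GA5))
```

Because both alleles are observed separately, the heterozygote (5, 7) is distinguishable from either homozygote, which a dominant marker such as a RAPD or AFLP band cannot show.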