Mitochondrial and Chloroplast DNA in Scaffolds

Goal
• Determine which scaffolds contain mitochondrial or chloroplast DNA
  – Grape and Arabidopsis reference sets
• Ideally, annotate the scaffolds/contigs in some way

Process
• A lot of BLAST results
• Program
  – Splits up the BLAST results
  – Counts the number of times a specific scaffold appears
• Store the data in a format that is editable in Excel

454
[Figure: Top 454 Mitochondria Hits (bar chart; axis 0 to 9). Scaffolds shown: scaffold18185, scaffold02014, scaffold14997, scaffold22747, scaffold12638, scaffold23105, scaffold14449, scaffold19050]

454
[Figure: Top 454 Chloroplast Hits (bar chart; axis 0 to 8). Scaffolds shown: scaffold07344, scaffold00103, scaffold15425, scaffold00515]

Illumina
[Figure: Top Illumina Mitochondria Hits (bar chart; axis 0 to 9)]

Illumina
[Figure: Top Illumina Chloroplast Hits (bar chart; axis 0 to 10). Contigs shown: ConsensusfromContig26966, ConsensusfromContig27000, ConsensusfromContig9008, ConsensusfromContig19940, ConsensusfromContig27060]

Why I Used the Most Frequent Hits
[Figure: bar chart; axis 0 to 8]

What Next?
• A possible additional feature: pull out the scaffold sequences that give x hits
• Annotation issues with Geneious

What We Can Do in Geneious
• Scaffolds identified through BLAST and hit counting were re-BLASTed as queries against grape and Arabidopsis mitochondrial DNA
  – Just took one of the scaffolds with the most hits and re-BLASTed it
• Some alignment with ORFs

1 Scaffold Against All Mitochondrial Genes for Grape and Arabidopsis
• Notice 14 total hits… Different e-value? Program wrong?
[Figure labels: Scaffold/contig; Mitochondrial Gene]

How Accurate Is the Geneious ORF?

Questions…
• How do we want to use Geneious?
• Is further work really helpful?
• Or is it good enough to know that these scaffolds are flagged?
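
The counting step described under Process (split up the BLAST results, count how often each scaffold appears, store the result in an Excel-friendly format) could look roughly like the sketch below. It is not the program used for these slides. It assumes the BLAST results are tab-separated (for example, BLAST+ `-outfmt 6`) with the scaffold ID in the second column; the function names, the `id_column` parameter, and the example rows are all hypothetical. The `scaffolds_with_min_hits` helper illustrates the "scaffolds that give x hits" idea from What Next?.

```python
import csv
import io
from collections import Counter


def count_hits(blast_lines, id_column=1):
    """Count how many BLAST hit rows mention each scaffold/contig.

    Assumption: rows are tab-separated and the scaffold ID sits in
    column `id_column` (0-based). Change it if scaffolds were the queries.
    """
    counts = Counter()
    for line in blast_lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) <= id_column:
            continue  # malformed row, ignore it
        counts[fields[id_column]] += 1
    return counts


def scaffolds_with_min_hits(counts, min_hits):
    """Return (scaffold, hits) pairs with at least `min_hits` hits."""
    return [(name, n) for name, n in counts.most_common() if n >= min_hits]


def write_counts_csv(counts, handle):
    """Write scaffold,hits rows as CSV, which Excel can open directly."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["scaffold", "hits"])
    writer.writerows(counts.most_common())


# Made-up rows for illustration only; not real results.
example_rows = [
    "geneA\tscaffold_X\t98.5\t200",
    "geneB\tscaffold_X\t97.0\t150",
    "geneA\tscaffold_Y\t91.2\t120",
]
counts = count_hits(example_rows)
out = io.StringIO()
write_counts_csv(counts, out)
print(out.getvalue())
```

Keeping the ID column as a parameter means the same counter works whether the scaffolds were the BLAST subjects or the queries. For real data, `count_hits` can be given an open file handle instead of a list.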