DNA Subway - iPlant Pods

Welcome to DNA Subway
Classroom-friendly Bioinformatics
DNA Subway
• Commonly used bioinformatics tools in streamlined workflows
• Teach important concepts in biology and bioinformatics
• Inquiry-based experiments for novel discovery and publication of data
Red Line: Genome annotation
• Analyze up to 150 kb of DNA sequence
• De novo gene prediction
• Construct evidence-based gene models
• Visualize genome sequence in browser
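Tip: The size limit above can be checked before uploading your own sequence. The sketch below is a minimal illustration, assuming the limit means 150,000 bases and that the input is plain FASTA text; the function names are hypothetical and are not part of DNA Subway.

```python
# Assumption: "150 kb" is read as 150,000 bases.
MAX_BASES = 150_000


def sequence_length(fasta_text: str) -> int:
    """Count sequence characters in FASTA text, skipping '>' header lines."""
    total = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if line and not line.startswith(">"):
            total += len(line)
    return total


def fits_red_line(fasta_text: str, limit: int = MAX_BASES) -> bool:
    """Return True if the sequence is no longer than the assumed limit."""
    return sequence_length(fasta_text) <= limit


# Hypothetical two-line record used only for illustration.
example = ">sample\nACGTACGT\nACGT\n"
print(sequence_length(example), fits_red_line(example))
```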
Yellow Line: Genome prospecting
• Analyze DNA or protein sequence
• Search plant genomes using TARGeT
• Explore gene duplications, transposons, and non-coding sequences not detectable in conventional BLAST searches
Blue Line: DNA barcoding and phylogenetics
• Analyze DNA or protein sequence
• Analyze DNA barcoding sequences to identify plant, animal, and fungal species
• Generate phylogenetic trees and publish sequences to GenBank
Green Line: Transcriptome analysis
• Examine RNA-Seq data for differential expression
• Use high-performance computing to analyze complete datasets
• Generate lists of genes and fold-changes; add results to Red Line projects
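Tip: A fold-change compares a gene's expression between two conditions. The sketch below shows one common way to express it (log2 of the ratio), assuming simple read counts and a pseudocount of 1 to avoid division by zero. The gene names and counts are hypothetical, and this is not necessarily the calculation the Green Line performs.

```python
import math


def log2_fold_change(control: float, treated: float, pseudocount: float = 1.0) -> float:
    """Return log2((treated + pseudocount) / (control + pseudocount))."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Hypothetical read counts per gene: (control, treated).
counts = {
    "geneA": (9, 39),
    "geneB": (50, 50),
}

for gene, (control, treated) in counts.items():
    print(gene, round(log2_fold_change(control, treated), 2))
```

A value of 2 means roughly four-fold higher expression in the treated sample; a value of 0 means no change.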
Annotate Genome Sequence
Detect Genes and Build Gene Models
DNA Subway: Red Line
Requires Java 6 or above (www.java.com)
Check that Java is enabled in your web browser
Log in to DNA Subway
Red Line: Demo analysis – determine a structure for an Arabidopsis gene
Task: Analyze a ~3 kb sequence from chromosome 1 of A. thaliana
• Create a project
• Detect all the genes present
• Import data from BLAST results and visualize in local browser
• Construct a gene model
• Verify gene model at Phytozome
Create a Red Line project
1. Click the Red Square to begin a project
2. Choose Plant and select Dicotyledon
3. Select sample sequence Arabidopsis thaliana (mouse-ear cress) Chr1, 3.40kb
4. Name the project and click Continue
Detect genes in the project sequence
5. Click Sequence to view the input sequence
6. Click RepeatMasker
7. When the View icon appears, click RepeatMasker again to examine the results
Tip: Before gene prediction, RepeatMasker attempts to identify repetitive sequences such as low-complexity regions, simple repeats, AT/GC-rich regions, and several types of transposons. Results are presented in a table. The ‘description=’ field of the Attributes column describes what type of repeat was detected.
Example: AT-rich sequence at 1667 bp
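Tip: If you copy a row out of the results table, the ‘description=’ field can be pulled out of the Attributes text programmatically. The sketch below assumes the Attributes column holds semicolon-separated key=value pairs (a GFF-style layout); the example row and the function name are hypothetical.

```python
def parse_attributes(attributes: str) -> dict:
    """Split a 'key=value;key=value' attributes string into a dictionary."""
    pairs = {}
    for field in attributes.split(";"):
        field = field.strip()
        if not field:
            continue  # ignore empty fields such as a trailing ';'
        key, _, value = field.partition("=")
        pairs[key] = value
    return pairs


# Hypothetical Attributes text for one RepeatMasker row.
row_attributes = "ID=repeat1;description=AT_rich"
print(parse_attributes(row_attributes).get("description"))
```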
8. Click one or more gene predictors (Augustus, FGenesH, SNAP, tRNA Scan)
9. When the View icon appears, click the gene predictor again to examine the results
Tip: De novo gene predictors predict genes within a given sequence. Each program is optimized differently, so results vary from program to program. The Attributes column details the features that make up a single predicted gene (e.g. the whole gene, mRNA, CDS, and exons). Sub-features are listed in the Type column.
Example: Augustus predicts a single gene (designated ‘g1’) with 4 exons
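Tip: The Type column can be tallied to confirm how many features of each kind a predictor reported. The sketch below is a minimal illustration using made-up rows that mimic a single gene ‘g1’ with 4 exons; the coordinates, identifiers, and row layout are assumptions, not actual DNA Subway output.

```python
from collections import Counter

# Hypothetical prediction rows: (type, start, end, attributes).
rows = [
    ("gene", 150, 2900, "ID=g1"),
    ("mRNA", 150, 2900, "ID=g1.t1;Parent=g1"),
    ("exon", 150, 400, "Parent=g1.t1"),
    ("exon", 600, 800, "Parent=g1.t1"),
    ("exon", 1000, 1300, "Parent=g1.t1"),
    ("exon", 1500, 2900, "Parent=g1.t1"),
]


def count_feature_types(rows):
    """Count how many rows there are for each value in the Type column."""
    return Counter(feature_type for feature_type, _start, _end, _attrs in rows)


summary = count_feature_types(rows)
print(summary["gene"], "gene,", summary["exon"], "exons")
```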
Import data from BLAST results and visualize in local browser
10. Click BLASTN to search and import similar DNA sequences
11. Click BLASTX to search and import similar sequences based on protein evidence
12. When the searches complete, click each tool again to examine the results
Tip: BLAST results are derived from the UniGene or UniProt databases and contain experimentally derived evidence (e.g. cDNAs) that can be used to infer a probable gene structure. The Attributes column has details on the sequence matches that were found (e.g. gene name, GenBank IDs).
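Tip: One simple way to see how matches suggest a gene structure is to merge overlapping match coordinates into blocks; gaps between blocks hint at introns. The sketch below assumes each match is a (start, end) pair on the project sequence. The coordinates are hypothetical, and this is an illustration rather than the method DNA Subway uses.

```python
def merge_hits(hits):
    """Merge overlapping or adjacent (start, end) intervals (1-based, inclusive)."""
    merged = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous block: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical match coordinates on the project sequence.
hits = [(120, 400), (350, 420), (900, 1200), (1150, 1300)]
print(merge_hits(hits))  # [(120, 420), (900, 1300)]
```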
13. Click Local Browser to visualize results
Tip: You can use the local browser (GBrowse) at any time to visualize the output of any tool.
Construct a gene model
14. Click on Apollo to start the program
15. Hide the reverse strand: click the View menu and select Hide Reverse Strand
16. Expand tiers: click the Tiers menu and select Expand all tiers
17. If too many tiers are displayed, click the Tiers menu, select Show Types Panel, and uncheck Show for any evidence you wish to hide
18. Double-click the Augustus model and drag it into the workspace
19. Double-click the new temporary model; right-click to open the Annotation info editor
20. Name the model ‘Augustus1’ in both ‘Symbol’ fields
21. Double-click the BLASTN model and drag it into the workspace
22. Double-click the new temporary model; right-click to open the Annotation info editor, then name the model ‘BLASTN1’ in both ‘Symbol’ fields
23. Zoom in to examine the 5’ and 3’ ends of the gene models
24. Double-click the Augustus1 model and right-click to open the Exon detail editor
25. Adjust the 5’ and 3’ ends of the Augustus1 model to match the evidence provided by the BLASTN1 model
Tip: The BLASTN evidence is most useful for determining the transcript length (e.g. the 5’ and 3’ ends).
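Tip: The end adjustment described above amounts to moving the outermost boundaries of the model to the evidence coordinates while leaving internal exon boundaries alone. The sketch below illustrates that idea for a forward-strand model; the coordinates and function name are hypothetical, and Apollo performs this edit interactively rather than through code.

```python
def adjust_ends(exons, evidence_start, evidence_end):
    """Return exons with the first start and last end moved to match the evidence.

    exons: list of (start, end) pairs on the forward strand.
    Internal exon boundaries are left unchanged.
    """
    if not exons:
        return []
    adjusted = sorted(exons)
    adjusted[0] = (evidence_start, adjusted[0][1])   # 5' end of the model
    adjusted[-1] = (adjusted[-1][0], evidence_end)   # 3' end of the model
    return adjusted


# Hypothetical Augustus1 exons and BLASTN1 transcript extent.
augustus1 = [(150, 400), (600, 800), (1000, 1300), (1500, 2900)]
print(adjust_ends(augustus1, evidence_start=100, evidence_end=3050))
```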
26. Use any other available evidence* (e.g. BLASTN, User BLAST(N/X)) to make alternative models if supported
27. Use the BLASTX evidence to determine start/stop codons. Drag any needed start and stop codons into your model.
*If you have hidden evidence, show it again from the Show Types Panel in the Tiers menu
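Tip: A start codon (ATG) and the first in-frame stop codon (TAA, TAG, or TGA) together bound the coding region. The sketch below shows how these positions could be located in a spliced, forward-strand sequence. It is a simplified illustration with a hypothetical sequence and does not replace checking the BLASTX evidence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_region(sequence: str):
    """Return (start, end) of the first ATG through its first in-frame stop codon.

    Positions are 0-based with an exclusive end; returns None if no
    start codon or no in-frame stop codon is found.
    """
    sequence = sequence.upper()
    start = sequence.find("ATG")
    if start == -1:
        return None
    for pos in range(start, len(sequence) - 2, 3):
        if sequence[pos:pos + 3] in STOP_CODONS:
            return start, pos + 3
    return None


# Hypothetical spliced sequence used only for illustration.
spliced = "CCATGGCTTGAAA"
region = find_coding_region(spliced)
print(region, spliced[region[0]:region[1]])  # (2, 11) ATGGCTTGA
```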
28. Delete the BLASTN1 model and any other extraneous models
29. Save your work back to DNA Subway: click the File menu and select Upload to DNA Subway, then close Apollo
Verify gene model at Phytozome
30. Click Phytozome Browser and compare the created model(s) to the accepted transcript(s)
Tip: Phytozome accepted transcripts are only available for DNA Subway sample sequences.