Essentials of Next Generation Sequencing Workshop 2014
University of Kentucky AGTC

Class 8
Integrative Genomics Viewer (IGV)

Background
The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.

Helga Thorvaldsdóttir, James T. Robinson, Jill P. Mesirov. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics 2012.
James T. Robinson, Helga Thorvaldsdóttir, Wendy Winckler, Mitchell Guttman, Eric S. Lander, Gad Getz, Jill P. Mesirov. Integrative Genomics Viewer. Nature Biotechnology 29, 24–26 (2011).

8.1 Installing IGV
Go to http://www.broadinstitute.org/software/igv/home
Scroll down the page to “Downloads” and click on the “register” link.
Type your information and click “Agree”.
On the left-hand side, click on “Downloads”.
For this workshop we will be using the 1.2 GB memory build of IGV, which allows IGV to use up to 1.2 GB of memory. If you find that IGV is running out of memory (shown at the bottom right of the IGV screen) and not loading files properly, you will want to switch to a larger build.
Click the appropriate “Launch” button, open the IGV .jnlp file you downloaded (i.e. igv_mm.jnlp), and allow it to run. You may want to click “Always trust content from this publisher.”

8.2 Adding a Genome and Annotations
Input(s):
    magnaporthe_oryzae_70-15_8_single_contig.fasta
    RNAseq/70-15_RNA_sample_1_thout/accepted_hits.bam
    genes/MAKER/maker-preview.gff or maker-annotations.gff
Output(s):
    accepted_hits.bam.bai
    Tracks and annotations visible in IGV

8.2.1 Adding a Genome
IGV can keep track of multiple genome assemblies.
To load our data, we need to copy it from the server and then load it into IGV as a new genome.

To copy the genome from the server:
Open WinSCP.
Connect to our server (128.163.192.150) with your username and password on port 22.
In the left pane, browse to your USB drive.
In the right pane, browse to your assembly directory.
Drag the magnaporthe_oryzae_70-15_8_single_contig.fasta file from the right pane to the left pane to transfer it to the local computer.

Now we will load the file into IGV:
Open IGV.
In the menu, select Genomes > Load Genome from file...
Browse to the local folder where you copied the file from the server and select magnaporthe_oryzae_70-15_8_single_contig.fasta.

8.2.2 Adding Annotation Tracks
To import BAM and SAM files into IGV, we first have to index them with SAMtools. Open a shell and navigate to the folder containing your sample 1 TopHat run data. There, run samtools:
    samtools index accepted_hits.bam
This will create the index file accepted_hits.bam.bai.
Now follow the WinSCP instructions in section 8.2.1 again, this time to retrieve the following files:
    RNAseq/70-15_RNA_sample_1_thout/accepted_hits.bam
    RNAseq/70-15_RNA_sample_1_thout/accepted_hits.bam.bai
    genes/MAKER/maker-preview.gff
Then return to IGV.
In the menu, select File > Load from file...
Open accepted_hits.bam.
Repeat this process to load maker-preview.gff; if your MAKER run has already completed, you can use your maker-annotations.gff instead.

Results may not appear immediately, or they may be difficult to read when they do. If you cannot see the features, zoom in using the zoom tool at the top right. If you still cannot see features, use the bar at the top to scroll around. If the features are too clumped together to see, right-click on the name of the track to display a context menu. You should see three options: collapsed, expanded, and squished.
Click on expanded and you should find your data easier to see. Within the right-click context menu you can also rename tracks, adjust heights and colors, and so on. Different file types have different options that alter the tracks. Descriptions of these options can be found at http://www.broadinstitute.org/software/igv/?q=PopupMenus#AlignmentTrack.

8.3 Creating Coverage Tracks
Input(s):
    accepted_hits.bam
Output(s):
    accepted_hits.bam.tdf
    Correct coverage track displayed in IGV

When you load the track, IGV shows a quickly generated coverage track, but this will not be completely accurate. To load a more accurate coverage map, we need to count the entries in the .bam file with igvtools.
In the menu, select Tools > Run igvtools
Select these options:
    o Command: Count (should be selected by default)
    o Input file: accepted_hits.bam
    o Output file: accepted_hits.bam.tdf (will be automatically filled when you select your input file)
    o Genome: magnaporthe_oryzae_70-15_8_single_contig.fasta (should be selected by default)
Click Run. Wait for ‘Done’ to appear in the box at the bottom of the screen, and then close the window.
Back in the browser, right-click on the “accepted_hits.bam Coverage” track and select “Load Coverage Data”.
Select the accepted_hits.bam.tdf file which was just created.
In the right-click context menu, you can also change the scaling for the coverage map.
    Log scale: Toggles logarithmic scaling for that track.
    Auto scale: Toggles the auto scaling function for a given track.

8.4 Getting Heat Maps from a .bam File
Input(s):
    accepted_hits.bam
    magnaporthe_oryzae_70-15_8_single_contig.fasta
Output(s):
    accepted_hits.bedgraph
    Heat map displayed in IGV

We are going to use the program genomeCoverageBed, which is part of the BEDTools suite, to create a .bedgraph file to load into IGV and display as a heat map.
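One detail about the genome argument may be useful before running genomeCoverageBed. BEDTools describes a "genome file" as a tab-delimited text file with one sequence name and its length per line, which is a different layout from a FASTA file. If your BEDTools version asks for such a file, the sketch below shows one way to derive it from the assembly FASTA using only awk. The function name fasta_to_genome and the tiny demo sequences are hypothetical and are not part of the workshop materials or of BEDTools.

```shell
#!/bin/sh
# Sketch: build a two-column "genome file" (name<TAB>length) from a FASTA file.
# fasta_to_genome is a hypothetical helper name used for illustration only.
fasta_to_genome() {
    awk '
        /^>/ {
            # A new header line: print the previous sequence, then reset
            if (name != "") printf "%s\t%d\n", name, len
            name = substr($1, 2)      # ID = header text up to the first whitespace
            len = 0
            next
        }
        {
            gsub(/[ \t\r]/, "")       # ignore stray whitespace and carriage returns
            len += length($0)
        }
        END {
            if (name != "") printf "%s\t%d\n", name, len
        }
    ' "$1"
}

# Demo on a made-up two-sequence FASTA (names and bases are illustrative only)
demo=$(mktemp)
printf '>contig_1 example\nACGTAC\nGT\n>contig_2\nGGGCC\n' > "$demo"
fasta_to_genome "$demo"
rm -f "$demo"
```

On the server, the output could be redirected to a file, for example fasta_to_genome ~/magnaporthe_oryzae_70-15_8_single_contig.fasta > ~/genome.txt, where genome.txt is an assumed file name.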
Type:
    genomeCoverageBed -ibam ~/RNAseq/70-15_RNA_sample_1_thout/accepted_hits.bam -bg -g ~/magnaporthe_oryzae_70-15_8_single_contig.fasta > ~/accepted_hits.bedgraph
Note: BEDTools documents the -g argument as a tab-delimited genome file (one sequence name and length per line) rather than a FASTA file. When -ibam is used, recent BEDTools releases read the sequence lengths from the BAM header, so the -g value may not be needed.
Now use WinSCP to copy the accepted_hits.bedgraph file to the local machine you're working on (see section 8.2.1 above).
Load the accepted_hits.bedgraph file into IGV with File > Load from file... (see section 8.2.2 above).
Right-click the “accepted_hits.bedgraph” track and change the type of graph to Heatmap.

Citations
Quinlan AR and Hall IM, 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26(6), pp. 841–842.
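Returning to the bedGraph file produced in section 8.4: before copying it to your local machine, it can be worth checking that it contains what you expect. A bedGraph file has four tab-separated columns (sequence name, 0-based start, end, and depth), and the -bg option reports only intervals with non-zero coverage. The following is a minimal sketch, assuming only that awk is available; the function name bedgraph_summary and the sample values are hypothetical.

```shell
#!/bin/sh
# Sketch: summarise a bedGraph file before loading it into IGV.
# bedgraph_summary is a hypothetical helper name, not a BEDTools or IGV command.
bedgraph_summary() {
    awk -F'\t' '
        # Skip optional "track"/"browser" header lines and comments
        /^(track|browser|#)/ { next }
        NF >= 4 {
            len = $3 - $2             # intervals are 0-based and half-open
            n++
            bases += len
            depth += len * $4         # weight each depth by interval length
        }
        END {
            if (bases > 0)
                printf "intervals=%d covered_bases=%d mean_depth=%.2f\n", n, bases, depth / bases
            else
                print "intervals=0 covered_bases=0 mean_depth=0.00"
        }
    ' "$1"
}

# Demo on made-up data (sequence name and depths are illustrative only)
sample=$(mktemp)
printf 'contig_1\t0\t100\t2\ncontig_1\t100\t150\t10\n' > "$sample"
bedgraph_summary "$sample"
rm -f "$sample"
```

For the made-up sample above, the function prints intervals=2 covered_bases=150 mean_depth=4.67. On the server it could be run as bedgraph_summary ~/accepted_hits.bedgraph; an empty or all-zero summary would suggest rechecking the genomeCoverageBed step before loading the file into IGV.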