Introduction.to.RCircos_1.1.2

RCircos: Circos Plot with R
Henry Zhang, Ph.D.
hzhang@mail.nih.gov
Genetics Branch
Center for Cancer Research
National Cancer Institute
http://Circos.ca
Circos has been used and referenced in many scientific publications (501 as of October 2013).
Circos plot of somatic mutations, copy number variations, transcriptome expression, and structural variations.
From inside to out: structural variations (purple and orange), copy number variations (gain in dark red, loss in dark blue), mRNA expression (up in gold, down in olive), differentially expressed microRNAs (up in red, down in green), DNA methylation with sky-blue background (up in dark orange, down in chartreuse), somatic mutations with gene symbols, and chromosomal cytobands.
Kim SC, Jung Y, Park J et al. 2013. A high-dimensional, deep-sequencing study of lung adenocarcinoma in female never-smokers. PLoS One 8:e55596.
> /usr//bin/CIRCOS/circos-0.60/bin/circos -conf mycircos.conf
Circos plot: extra steps are needed to prepare the data files and the configuration file from the dataset(s)
hsY 0 49999 0.00804
hsY 50000 99999 0.0033
hsY 100000 149999 0.0084
hsY 150000 199999 0.00316
hsY 200000 249999 0.007
hsY 250000 299999 0.00466
hsY 300000 349999 0.00636
hsY 350000 399999 0.00576
hsY 400000 449999 0.00678
hsY 450000 499999 0.00688
hsY 0 49999 402
hsY 50000 99999 165
hsY 100000 149999 420
hsY 150000 199999 158
hsY 200000 249999 350
hsY 250000 299999 233
hsY 300000 349999 318
hsY 350000 399999 288
hsY 400000 449999 339
hsY 450000 499999 344
<<include ideogram.conf>>
<image>
<<include etc/image.conf>>
</image>
.
.
<plots>
<plot>
file = data/6/snp.density.50kb.txt
color = black
r0 = 1.075r
r1 = 1.15r
.
.
</plot>
<plot>
file = data/6/snp.number.50kb.txt
color = black
r0 = 0.85r
r1 = 0.95r
.
.
</plot>
</plots>
.
.
.
R Bioconductor package: ggbio
(Yin, T. et al: Genome Biology 2012, 13:R77)
• Requires ggplot2, biovizBase, and GenomicRanges packages
• All data are held in GRanges objects
• All plots are wrapped into one function
• Plots are arranged in layers
• Parameters must be correctly selected
layout_circle() in ggbio package: a complicated function with some unimplemented plot functionalities
layout_circle(data, ..., geom = c("point", "line", "link", "ribbon",
"rect", "bar", "segment", "hist", "scale", "heatmap", "ideogram",
"text"), linked.to, radius = 10, trackWidth = 5, space.skip = 0.015,
direction = c("clockwise", "anticlockwise"), link.fun = function(x, y,
n = 30) bezier(x, y, evaluation = n), rect.inter.n = 60, rank, ylim =
NULL, scale.n = 60, scale.unit = NULL, scale.type = c("M", "B",
"sci"), grid.n = 5, grid.background = "gray70", grid.line = "white",
grid = FALSE, chr.weight = NULL)
An R package implementing Circos 2D track plots.
Implemented in pure R; relies only on packages that come with the base R installation.
Provides a set of graphic functions: each plot type, such as scatter, line, histogram, heatmap, tile, connectors, links (lines and ribbons), and text (gene) labels, has its own function.
Uses only low-level R graphics functions.
Availability: http://www.r-project.org
Reference: Hongen Zhang, Paul Meltzer and Sean Davis
(2013): RCircos: an R package for Circos 2D track plots.
BMC Bioinformatics 14:244
easy to use
Basic steps to generate an RCircos plot image:
1. Load RCircos library
2. Load chromosome cytoband data
3. Set up RCircos core components
4. Load input data
5. Open graphic device
6. Call specific plot functions to plot each data track (chromosome ideogram, gene names, heatmap, scatter plot, ...)
7. Close the graphic device if it is an image file
(The graphic device can be the R GUI or a TIFF, PNG, or PDF file.)
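The seven steps can be sketched end to end as follows. This is a minimal illustration, not the presenter's own script: it assumes the RCircos package is installed, uses the cytoband and heatmap datasets bundled with the package, and writes to a temporary PDF whose file name is made up for the example.

```r
library(RCircos)                                   # 1. load the library

data(UCSC.HG19.Human.CytoBandIdeogram)             # 2. cytoband data bundled with RCircos
cyto.info <- UCSC.HG19.Human.CytoBandIdeogram

RCircos.Set.Core.Components(cyto.info,             # 3. set up core components
                            chr.exclude = NULL,
                            tracks.inside = 5,
                            tracks.outside = 0)

data(RCircos.Heatmap.Data)                         # 4. input data (bundled demo set)

# 5. open a graphic device; the file name is an assumption for this sketch
out.file <- file.path(tempdir(), "RCircos.sketch.pdf")
pdf(file = out.file, height = 8, width = 8)

RCircos.Set.Plot.Area()                            # 6. plot ideogram, then data tracks
RCircos.Chromosome.Ideogram.Plot()
RCircos.Heatmap.Plot(RCircos.Heatmap.Data, data.col = 6,
                     track.num = 1, side = "in")

dev.off()                                          # 7. close the file device
```

Plotting to the R GUI instead only requires dropping the pdf() and dev.off() calls.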
supports multiple ideogram data
use data frame for input data
Input data format for heatmap, histogram, scatterplot, line plot, gene label, connector, and tile:
Input data format for links:
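A minimal sketch of the two layouts, assuming the convention described in the RCircos documentation: track data start with three genomic position columns (chromosome, start, end) followed by names or values, and link data hold two such position triplets per row. All coordinates, gene names, and values below are made up for illustration.

```r
# Track data (heatmap, histogram, scatter, line, gene label, connector, tile):
# chromosome, start, end, then one or more data columns.
track.data <- data.frame(
  Chromosome = c("chr1", "chr1", "chr2"),
  chromStart = c(8921418, 161067518, 43449541),
  chromEnd   = c(8934967, 161069970, 43453745),
  GeneName   = c("GeneA", "GeneB", "GeneC"),       # hypothetical labels
  Value      = c(0.53, -1.20, 2.10),               # hypothetical measurements
  stringsAsFactors = FALSE
)

# Link data: two genomic positions per row, six columns in total.
link.data <- data.frame(
  Chromosome   = c("chr1", "chr3"),
  chromStart   = c(8284703, 41240941),
  chromEnd     = c(8285399, 41281939),
  Chromosome.1 = c("chr8", "chr12"),
  chromStart.1 = c(128748314, 25358179),
  chromEnd.1   = c(128753680, 25403854),
  stringsAsFactors = FALSE
)

str(track.data)
str(link.data)
```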
easy to integrate with other R graphic functions
Mouse and rat chromosome ideograms, heatmaps, and link lines are drawn with RCircos with two
input datasets. Title, legend, and color key are added with function calls of R graphics package.
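Because RCircos draws with ordinary R graphics, decorations can be added with the usual base functions once the tracks are drawn. The sketch below uses only the graphics package; the title text, legend entries, and output file name are assumptions for illustration, and the RCircos plotting calls are left as a comment.

```r
out.file <- file.path(tempdir(), "decoration.sketch.pdf")   # hypothetical file name
pdf(file = out.file, height = 8, width = 8)

par(mai = c(0.25, 0.25, 0.25, 0.25))
plot.new()
plot.window(c(-2.5, 2.5), c(-2.5, 2.5))

# ... RCircos ideogram and data track calls would go here ...

# Base graphics calls draw on the same device as the RCircos tracks.
title(main = "Example title", line = -2)
legend("topright", legend = c("Up", "Down"),
       fill = c("red", "blue"), cex = 0.8, bty = "n")

dev.off()
```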
Step 1: decide how many data tracks and
where you will plot them
RCircos follows the layout paradigm set
forth by Circos and arranges data plots by
tracks.
The core track is the chromosome
ideogram track with highlighting and
labels.
Data plot tracks can be placed inside or
outside of chromosome ideogram track.
Step 2: Load cytoband data and setup RCircos
core components
Plot with all chromosomes:
To plot only some chromosomes:
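The code for these two cases is not preserved on the slide; a minimal sketch, assuming the bundled human hg19 cytoband table and ten inside tracks (both choices are illustrative), might look like this. Chromosomes are dropped through the chr.exclude argument.

```r
library(RCircos)
data(UCSC.HG19.Human.CytoBandIdeogram)
cyto.info <- UCSC.HG19.Human.CytoBandIdeogram

# All chromosomes: nothing excluded.
RCircos.Set.Core.Components(cyto.info, chr.exclude = NULL,
                            tracks.inside = 10, tracks.outside = 0)

# Only some chromosomes: here chrX and chrY are left out.
RCircos.Set.Core.Components(cyto.info, chr.exclude = c("chrX", "chrY"),
                            tracks.inside = 10, tracks.outside = 0)

# The stored ideogram reflects the most recent call.
ideogram <- RCircos.Get.Plot.Ideogram()
unique(as.character(ideogram[[1]]))
```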
Step 3: modify plot parameters if necessary
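A minimal sketch of the get, modify, reset pattern, assuming core components are set up from the bundled hg19 cytoband table. The parameter changed here (base.per.unit) and its new value are only an example.

```r
library(RCircos)
data(UCSC.HG19.Human.CytoBandIdeogram)
RCircos.Set.Core.Components(UCSC.HG19.Human.CytoBandIdeogram,
                            chr.exclude = NULL,
                            tracks.inside = 10, tracks.outside = 0)

# Fetch the current parameters, change one, and store them back.
params <- RCircos.Get.Plot.Parameters()
params$base.per.unit <- 3000
RCircos.Reset.Plot.Parameters(params)

# Confirm the stored value.
RCircos.Get.Plot.Parameters()$base.per.unit
```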
Step 4: load input datasets
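As an illustration, the demo datasets bundled with RCircos can be loaded with data(). For your own data, any function that returns a data frame should do; the file name in the commented read.table() call is hypothetical.

```r
library(RCircos)

# Demo datasets shipped with the package.
data(RCircos.Gene.Label.Data)
data(RCircos.Heatmap.Data)
data(RCircos.Link.Data)

head(RCircos.Heatmap.Data[, 1:6])
head(RCircos.Link.Data)

# Your own tab-delimited file with a header row (hypothetical file name):
# my.data <- read.table("my.track.data.txt", sep = "\t",
#                       header = TRUE, stringsAsFactors = FALSE)
```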
Step 5: open graphic device
Types of graphic devices that RCircos supports: R GUI, and PNG, PDF, or TIFF files
png(filename="RCircos.Demo.Human.png", height=8, width=8, units="in", type="windows", res=300);
tiff(filename="RCircos.Demo.Human.tif", height=8, width=8, units="in", type="windows", res=300);
pdf(file="RCircos.Demo.Human.pdf", height=8, width=8);
(type="windows" is available on Windows only; omit it on other platforms.)
(If using image files, graphic device must be closed after plot is done).
RCircos.Set.Plot.Area();
or, if you want to control the size of the plot area:
par(mai=c(0.25, 0.25, 0.25, 0.25));
plot.new();
plot.window(c(-1.5, 1.5), c(-1.5, 1.5));
Step 6: plot chromosome ideogram and data tracks
RCircos.Gene.Connector.Plot(gene.data, track.num=1, side="in")
RCircos.Chromosome.Ideogram.Plot()
RCircos.Heatmap.Plot(heatmap.data, data.col=6, track.num=5, side="in")
RCircos.Gene.Name.Plot(gene.data, name.col=4, track.num=2, side="in")
RCircos.Tile.Plot(tile.data, track.num=9, side="in")
RCircos.Line.Plot(line.data, data.col=4, track.num=7, side="in")
RCircos.Link.Plot(link.data, track.num=11, by.chromosome=FALSE)
RCircos.Histogram.Plot(hist.data, data.col=4, track.num=6, side="in")
RCircos.Scatter.Plot(scatter.data, data.col=5, track.num=8, side="in", by.fold=1)
RCircos.Ribbon.Plot(ribbon.data, track.num=10, by.chromosome=FALSE, twist=FALSE)
RCircos core component: RCircos plot parameters
track.padding
track.out.start
highlight.pos
track.in.start
chr.name.pos
track.height
chr.ideog.pos
chrom.width
chrom.paddings
RCircos core component: RCircos plot positions
RCircos core component: RCircos plot ideograms
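The three core components can be inspected through their getter functions once they are set up. A minimal sketch, assuming the bundled hg19 cytoband table and ten inside tracks:

```r
library(RCircos)
data(UCSC.HG19.Human.CytoBandIdeogram)
RCircos.Set.Core.Components(UCSC.HG19.Human.CytoBandIdeogram,
                            chr.exclude = NULL,
                            tracks.inside = 10, tracks.outside = 0)

params    <- RCircos.Get.Plot.Parameters()   # plot parameters
positions <- RCircos.Get.Plot.Positions()    # coordinates of points on the circle
ideogram  <- RCircos.Get.Plot.Ideogram()     # cytoband table with plot attributes

names(params)
head(positions)
head(ideogram)
```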
Customizing heatmap plot colors
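One way to do this, sketched under the assumption that the heatmap colour scale is controlled by the heatmap.color plot parameter and that "GreenWhiteRed" is one of its accepted values (check the package documentation for the list supported by your version):

```r
library(RCircos)
data(UCSC.HG19.Human.CytoBandIdeogram)
RCircos.Set.Core.Components(UCSC.HG19.Human.CytoBandIdeogram,
                            chr.exclude = NULL,
                            tracks.inside = 10, tracks.outside = 0)

# Switch the heatmap colour scale before calling RCircos.Heatmap.Plot().
params <- RCircos.Get.Plot.Parameters()
params$heatmap.color <- "GreenWhiteRed"      # assumed to be a valid scale name
RCircos.Reset.Plot.Parameters(params)

RCircos.Get.Plot.Parameters()$heatmap.color
```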
Customizing other plot colors by appending a column of color names to the input dataset
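A minimal sketch of appending such a column in base R. The column name PlotColor is an assumption taken from the RCircos package documentation, and the data values and the red/blue rule are made up for illustration.

```r
# Hypothetical scatter data: three position columns plus one value column.
scatter.data <- data.frame(
  Chromosome = c("chr1", "chr1", "chr2"),
  chromStart = c(1000000, 5000000, 2000000),
  chromEnd   = c(1050000, 5050000, 2050000),
  Value      = c(1.8, -0.7, 0.4),
  stringsAsFactors = FALSE
)

# Append one colour name per row; any name listed by colors() is valid in R.
scatter.data$PlotColor <- ifelse(scatter.data$Value >= 0, "red", "blue")

scatter.data
```

The data frame is then passed to the plot function as usual, with data.col still pointing at the value column.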
Add more decorations
Adjust plot range