Running VisANT to make pictures for modules

Step 1: Install Java on your computer
Step 2: Download VisANT
There is also a web-browser version that is apparently more popular, but that I have
never used, as well as some tutorials that can be found at the main site.
Step 3: Open VisANT and read in some data
A) First, open this file (in the future, you can substitute your own file). You will
notice that the data is in a certain format:
Column 1 – Name of first gene in the interaction/co-expression/etc. pair
Column 2 – Name of second gene in pair
Column 3 – A number representing the width of your edges (I think this can range from
0-1, but I always use 0).
Column 4 – A vector for the line color (let's use M1003… I don't remember where to find
the conversion from character string to colors)
Column 5 – Your value of interest, sorted descending. Typically, this is topological
overlap, correlation, adjacency, number of interactions, etc. I think it
has to range from 0-1 as well.
(Columns 6+ will not be used.)
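As a rough illustration of this layout, here is a minimal Python sketch that builds a five-column table from a list of gene pairs. The function name make_visant_rows, the partner gene names, the weights, and the tab separator are all assumptions made for illustration; the edge width of 0 and the color string M1003 simply follow the column descriptions above.

```python
def make_visant_rows(pairs, width=0, color="M1003"):
    """Return tab-separated rows: gene1, gene2, width, color, weight.

    `pairs` is a list of (gene1, gene2, weight) tuples. Rows are sorted by
    weight, descending, to match the file layout described above.
    """
    ordered = sorted(pairs, key=lambda p: p[2], reverse=True)
    return [
        "\t".join([g1, g2, str(width), color, f"{w:.3f}"])
        for g1, g2, w in ordered
    ]


# Hypothetical example data (partner genes and weights are made up).
example_pairs = [
    ("CREB3", "GENE_A", 0.42),
    ("CREB3", "GENE_B", 0.81),
    ("GENE_A", "GENE_B", 0.37),
]
rows = make_visant_rows(example_pairs)
print("\n".join(rows))
```

The tab separator is a guess on my part; if your file uses a different delimiter, change the join string to match.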
B) Next, you are going to copy and paste a subset of these data into VisANT.
If you have fewer than ~250 interactions (or ~500 reciprocal interactions), then you can
use all of them; otherwise, your picture will be too cluttered. (This is why the file is
sorted descending… so that we can copy only the top N rows.) In this case, we will include
the best 300 reciprocal interactions. To do this, copy the first 301 rows (EXCLUDING the
header row) and the first 5 columns…
…and paste it into VisANT in the lower left-hand corner box (see arrow below). Then
click add (circle below). Your screen should look something like this:
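If you would rather prepare the text to paste with a script than by hand, the row and column selection in B) can be sketched in Python as follows. This is only a sketch under assumptions: the function name top_rows is hypothetical, the table is assumed to be tab-delimited with a single header row, and the example rows are made up.

```python
def top_rows(table_text, n_rows=300, n_cols=5, sep="\t"):
    """Drop the header, keep the first n_rows data rows and first n_cols columns."""
    lines = [ln for ln in table_text.splitlines() if ln.strip()]
    data_lines = lines[1:1 + n_rows]  # lines[0] is assumed to be the header row
    return "\n".join(sep.join(ln.split(sep)[:n_cols]) for ln in data_lines)


# Hypothetical table, already sorted descending on column 5, with an unused 6th column.
example_table = "\n".join([
    "gene1\tgene2\twidth\tcolor\tweight\tnote",
    "CREB3\tGENE_A\t0\tM1003\t0.81\tx",
    "CREB3\tGENE_B\t0\tM1003\t0.42\ty",
    "GENE_A\tGENE_B\t0\tM1003\t0.37\tz",
])
to_paste = top_rows(example_table, n_rows=2)
print(to_paste)
```

Because the file is already sorted descending on column 5, taking the first rows is the same as taking the strongest interactions.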
Step 4: Quick formatting and labeling of the data
Congratulations, your data is now in VisANT! Now we need to make it look good. The
first thing we are going to do is to adjust the colors and sizes of the nodes and edges.
A) Select all of the nodes either using your cursor, or by clicking in the main box
and pushing CTRL-A
B) Then click on Node(s) / Properties
C) From here, you can adjust the color and size of the nodes. You can also add
labels, and choose their size and position. We will adjust the settings as shown
below; then click Ok. (Make sure the Expansion and KEGG buttons are unchecked.)
Your nodes should now be black and have small labels below them (that you probably won't
be able to read very well), like this:
D) We now want to make the lines thinner… to do this, click on Options, and uncheck
"Fine Arts". Your plot will now look like this:
E) Now, let's make the plot layout look more reasonable--we want to be able to
actually see what is going on! To do this, click on Layout, and then choose one of
the three "Relaxing…" options. I prefer "Elegant Relaxing…" but it is mostly a
matter of personal preference.
Your nodes will now jump around for a while, but will eventually settle down into a
steady state (after a few seconds). At this point, click on the "Stop Animation" button.
[In some cases you will have more than one group of genes. In this case you have two
options: 1) delete all of the genes that are not part of the main group (this is what I
usually do) and follow the steps below; or 2) drag the two groups near each other on the
screen, click "Fit to Page", then follow the steps below.]
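If you want to know ahead of time whether your edge list splits into more than one group, you can check it outside VisANT. The sketch below is a plain breadth-first search over the gene pairs, not anything VisANT itself exposes; the function name connected_groups and the example genes other than CREB3 are hypothetical.

```python
from collections import defaultdict, deque


def connected_groups(edges):
    """Return the sets of connected genes, largest group first."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    groups = []
    for start in list(neighbors):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        group = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nb in neighbors[node]:
                if nb not in seen:
                    seen.add(nb)
                    group.add(nb)
                    queue.append(nb)
        groups.append(group)
    return sorted(groups, key=len, reverse=True)


# Hypothetical edge list with one main group and one small side group.
example_edges = [("CREB3", "GENE_A"), ("GENE_A", "GENE_B"), ("GENE_X", "GENE_Y")]
groups = connected_groups(example_edges)
print(groups[0])  # the main group
```

Genes outside groups[0] are the ones you would delete under option 1.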
Step 5: Label your hub genes
As you can see from the plot, there is at least one hub gene (CREB3), and possibly a few
more (hub genes are genes with many connections to other genes). To highlight these genes
in the graph, we are now going to increase the size of the corresponding nodes and labels.
To do so, we click on Topology / Global Statistics / Degree Distribution.
The box below will pop up. To highlight all nodes with at least N connections (in this
case, N = 13), click and drag your mouse across the part of this plot that includes all of
the orange dots with Node Degree (k) >= N. You can see that box highlighted below.
When you release your mouse, you will see circles around the relevant orange dots. When
you then click the X in the upper right of this window to close it, you will see that a
few of the nodes are highlighted. Do NOT click anywhere in the main window! Repeat Step
4 B-C above to increase the size of these nodes and their labels from 16 -> 25.
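To double-check which genes should end up highlighted, you can count the connections per gene directly from your edge list. This is a minimal sketch, assuming that degree means the number of distinct partners and that a reciprocal pair (A-B and B-A) counts once; VisANT's own count may differ. The function names and the GENE_* partners are hypothetical.

```python
from collections import Counter


def node_degrees(edges):
    """Count distinct partners per gene; reciprocal duplicates count once."""
    unique_pairs = {frozenset(e) for e in edges if e[0] != e[1]}
    degrees = Counter()
    for pair in unique_pairs:
        for gene in pair:
            degrees[gene] += 1
    return degrees


def hub_genes(edges, min_degree=13):
    """Return genes with at least min_degree connections, sorted by name."""
    return sorted(g for g, k in node_degrees(edges).items() if k >= min_degree)


# Hypothetical data: CREB3 connected to 13 partners, plus one reciprocal duplicate.
example_edges = [("CREB3", f"GENE_{i}") for i in range(13)] + [("GENE_0", "CREB3")]
print(hub_genes(example_edges))
```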
Step 6: Make sure all of your labels are legible
This is the most annoying part. Repeat the following steps until all of your labels are
legible.
A) Manually drag the nodes from the edge of the screen by clicking on them and
dragging the mouse until they are closer to their corresponding hub gene. In this
case, we will move the nodes from the bottom left of the screen closer to CREB3.
B) Click on "Fit to Page". This will make the plot fill your workspace. You can
further increase your workspace size by maximizing VisANT to fill your entire
computer window.
 You should probably save your workspace about now just to be safe!
 You can also highlight several nodes at once by holding the mouse down and creating
a box around them. Then you can click and drag one node (as in A) and they will all
move. Your window will now look something like this:
Repeat the above steps a few times, and then you should be able to make all of the labels
visible and make the plot look just how you want it by making only minor tweaks of the
nodes.
I should point out that this is a mathematically legal thing to do because the positions
of and distances between nodes do not mean anything. Information is only contained in
the actual connections.
Eventually you will have a plot that looks something like this:
at which point your plot is done.
Step 7: Saving your plot.
You may have already saved your work in VisANT (if not, you should do that now). This
file will allow you to open your plot again in VisANT in case you need to make any
changes, but will not allow you to view the plot in any other program or to include it in
a publication. To save the actual image, you have two options:
1) Save as the Image: This is the easiest way to save your image. It will result in a
relatively low resolution png or jpg file. I would recommend using this method if
you are making the plots for yourself, but not if they are going to be in a paper.
2) Export as SVG File: This method will give you an SVG file, which can then be
imported into another program (like Adobe Illustrator) and saved as a high
resolution image. This method is advantageous in that you can end up with a high
quality image, change the fonts, move things around, etc.; BUT it requires that you
know how to manipulate SVG files.
________________________________________________________________________________________
Step 8 (optional): Showing only your top interactions.
There are many other things that you can do with VisANT, and many modifications that can
be done to the above steps, but the above steps will get you your basic interaction plot.
Your best bet is to just play around with the program and to read the tutorials. That
said, there are a few other useful things that I want to highlight.
Sometimes you only want to include edges for a subset of your interactions (I'll give one
example below). Recall when you originally input your data, the 5th column was sorted
descending:
The numbers in this column can be used by VisANT to filter your connections. To do this,
you need to adjust the "Weight Cutoff" (blue arrow). Usually I will enter a number for
the lower weight that is pretty close to what I want, and then toggle up and down using
the arrow keys (red arrow).
In this case, by setting the cutoff to 0.4, we can see that most of the interactions
between singleton genes and CREB3 have lower topological overlap than the interactions
between the hub genes:
Once you have this image, you should save it (see Step 7). This way you can have
identical plots both with and without all of the connections to display side by side.
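To get a feel for which edges a given cutoff would leave, you can apply the same threshold to your edge list before opening VisANT. This is only a sketch: the function name apply_weight_cutoff and the example weights are hypothetical, and I am assuming that edges with a weight equal to the cutoff are kept, which may not match how VisANT treats the boundary.

```python
def apply_weight_cutoff(edges, cutoff=0.4):
    """Keep (gene1, gene2, weight) edges whose weight is at least `cutoff`."""
    return [(a, b, w) for a, b, w in edges if w >= cutoff]


# Hypothetical weights (e.g., topological overlap) for three edges.
example_edges = [
    ("CREB3", "GENE_A", 0.62),
    ("CREB3", "GENE_B", 0.35),
    ("GENE_A", "GENE_C", 0.40),
]
strong_edges = apply_weight_cutoff(example_edges, cutoff=0.4)
print(strong_edges)
```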
So why would you want to show this? In some cases you want to see how well preserved a
module from one data set is in a second data set. Take this example file:
Here we have sorted based on the topological overlap in data set 2, but will enter the
topological overlap from data set 1 as our fifth column in VisANT. Often this
results in a plot where most of the connections for one of the hubs are preserved, but
most of the other connections are gone. In this case, you could say that the first hub
is "differentially connected" between data sets, which is often biologically interesting.
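One way to put such a file together is sketched below in Python. It assumes you already have the topological overlap for each gene pair in both data sets as dictionaries; the function name preservation_rows, the gene names other than CREB3, and all of the numbers are made up for illustration.

```python
def preservation_rows(to_set1, to_set2, width=0, color="M1003"):
    """Order pairs by overlap in data set 2; report overlap from data set 1.

    Both arguments map (gene1, gene2) tuples to a topological overlap value.
    Only pairs present in both data sets are kept.
    """
    shared = [pair for pair in to_set2 if pair in to_set1]
    shared.sort(key=lambda pair: to_set2[pair], reverse=True)
    return [(p[0], p[1], width, color, to_set1[p]) for p in shared]


# Hypothetical overlaps: the CREB3-GENE_B edge is strong in set 2, weak in set 1.
overlap_set1 = {("CREB3", "GENE_A"): 0.55, ("CREB3", "GENE_B"): 0.05,
                ("GENE_C", "GENE_D"): 0.10}
overlap_set2 = {("CREB3", "GENE_A"): 0.60, ("CREB3", "GENE_B"): 0.70,
                ("GENE_C", "GENE_D"): 0.50}
rows = preservation_rows(overlap_set1, overlap_set2)
for row in rows:
    print("\t".join(str(field) for field in row))
```

With rows built this way, the top N rows are the strongest connections in data set 2, and the weight cutoff then hides the ones that are weak in data set 1.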
THE END!
E-mail me if you have any questions: Jeremy Miller (jeremym@alleninstitute.org).