>> Host: So thank you for coming to the second session. We have a very nice, full set of talks today. But I have one [inaudible] from the organizers.
If you are presenting a game, then please bring your laptop during the coffee break and also the lunch break, and demonstrate your game near the poster session so that people can try your game. So today and tomorrow, please bring your laptop and demonstrate your game.
And the first talk will be given by Ulrik Brandes. Thank you.
>> Ulrik Brandes: All right. That was -- now you should hear me better. This is joint
work with my PhD student Mirza Klimenta. And Peter Eades calls it a myth that
professors are important. This is based on the conjecture that there's an exponential
decay in knowledge compared to the next generation of students that you have.
And Mirza's my PhD student. He should be five times more knowledgeable than me,
according to this conjecture. And maybe here's some evidence to support the
conjecture. These are the last three theses, PhD theses, that I had to review. This is
Mirza's that he just turned in. [laughter]. It's the most comprehensive work on MDS for
graph drawing that I have seen so far. And just to be sure, it's 11 point with small
margins, right? [laughter]. So he should really be giving the talk, but it was too difficult
for him to get here, so I will do that.
It's about MDS, and I'll just briefly recall what the principle is. Given some pair-wise
dissimilarities between objects we're trying to represent them in some geometric space,
two or three dimensional, such that the distances in that space resemble the
dissimilarities that were input into the method. It's a whole family of possible
approaches. And the two main variants are shown here or are illustrated here.
In GD, the typical use of this method is that the dissimilarities that we input into the
method are the shortest path distances in the graph. But other distances are possible.
All right.
And of course, the target space is two or three-dimensional because we would like to
look at these drawings at some point.
The two main members of this family, or the main distinction between different members of this family, is between classical scaling and distance scaling, where classical scaling was the first method proposed in this area. It is based on spectral decomposition, so it's basically a spectral method, and the other one is an iterative method that tries to optimize some objective.
You can see there's a qualitative difference between the results, qualitative not in terms of quality but in terms of them being really distinct kinds of results that we obtain: local details are much better visible in this case, whereas large distances are represented very well in this case, with some degeneracies.
This is the comparison between the two. As I said, one of them is based on spectral
decomposition, the other one is based on iterative minimization of a quadratic objective.
For the spectral decomposition method one of the very good things is we have very fast
approximation methods.
For distance scaling this is not so much the case, even though there are speed-up heuristics. Because it's a spectral method, if the spectrum is not well behaved, if we have higher multiplicity of the large eigenvalues, there can be degeneracies. And you've seen some of those degeneracies as well, because we're only representing a subspace.
For distance scaling we have local minima; because of the iterative method, it highly depends on the initialization, and you've seen examples in the previous session where this may be the case. Because this is a spectral decomposition, it's also considered to be
very inflexible because we set up a matrix and then we look at the eigenvectors and
there's nothing we can do about the result anymore. Whereas here we can fiddle with
the objective function, we can change weights, we can introduce other elements into the
graph and so on. So it's considered to be more flexible.
In terms of the result, the overall qualitative difference is that we see the overall shape of the graph very well in this case, because of this tendency to represent larger distances more accurately, and we see more of the local details of the graph in the distance scaling method.
So the conclusion of this comparison is that in practice, the typical way of using this is to
use classical scaling as an initializer to unfold the graph very well and then improve the
local details by running a few iterations of distance scaling. So as a simple way of
memorizing this, C is before D, right? [laughter].
What I'm going to do in this paper, what I'm trying to show, is that we propose two adaptations that address the inflexibility and the degeneracies of classical scaling. Because it can be approximated very fast, we can apply it to very large graphs. But if we look at very large graphs, we still want them to look decent. So we thought there are ways to modify classical scaling in such a way that we can control the output a little better than if we apply it in the standard way.
Because it's about classical scaling, just in a nutshell: what is the spectral decomposition method? Our input is some dissimilarities. Typically they would be metric, and in graph drawing they are, because we are using shortest-path distances. The goal is to find coordinates, represented as an n-by-d matrix where the d-dimensional coordinates are the row vectors, such that the input dissimilarities, the shortest-path distances, are represented well by the Euclidean distances in the target space. All right?
And note that distances are closely related to inner products, because we can write the squared distance as a sum of inner products: ||x_i - x_j||^2 = <x_i, x_i> - 2<x_i, x_j> + <x_j, x_j>. So we can look at classical scaling as the reverse, as a reconstruction task. Because if we knew the coordinates, we could compute all the inner products; from the inner products we compute the distances; and then we have the matrix that was given as the input. Now we're trying to do the reverse, not even knowing whether such coordinates exist.
So we set up an inner-product matrix using this formula, which is derived from the other, easy-to-recognize formula in such a way that one of the degrees of freedom is resolved. Because distances, of course, are translation invariant, whereas the coordinates are not. And the way this degree of freedom is resolved has an effect on how the matrix is set up. In this case, it's the centroid of the configuration that is placed at the origin of the result. All right. The details are not that important. It's just important to see that we can easily set up this matrix by doing some computations on the input dissimilarities.
And then we apply a standard method of decomposition. So this matrix is symmetric.
We can do spectral decomposition on this matrix and get the coordinates out of this. All
right? So this is what we do. We get the matrix of eigenvectors of B, the diagonal
matrix of eigenvalues and again the eigenvectors and then we cut this in the middle by
just taking the square root of the eigenvalues here for the D largest eigenvalues that we
use as the dimension of the result.
So it's not difficult technically. Two things to notice: it is known that this decomposition is the best rank-d approximation of the original matrix B. So what we're basically doing is optimizing an objective function that looks like this, where the error in the inner products is penalized. Right? So we're trying to optimize this function implicitly. And because of that, we have a bias towards larger distances. This is where this comes from.
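To make the steps concrete, here is a minimal sketch of classical scaling in Python, assuming the input D is the matrix of pairwise shortest-path distances; this is an illustration of the procedure just described, not the speaker's implementation:

    import numpy as np

    def classical_scaling(D, dim=2):
        # D: n x n matrix of dissimilarities (e.g. shortest-path distances).
        n = D.shape[0]
        # Double centering places the centroid of the configuration at the origin.
        J = np.eye(n) - np.ones((n, n)) / n
        B = -0.5 * J @ (D ** 2) @ J                    # inner-product matrix
        # Spectral decomposition of the symmetric matrix B.
        eigvals, eigvecs = np.linalg.eigh(B)
        top = np.argsort(eigvals)[::-1][:dim]          # d largest eigenvalues
        scale = np.sqrt(np.maximum(eigvals[top], 0.0)) # "cut in the middle"
        return eigvecs[:, top] * scale                 # n x dim coordinates

Keeping only the d largest eigenvalues is exactly the best rank-d approximation mentioned above, which is where the bias towards large distances comes from.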
So this degree of freedom of choosing the centroid as the origin is very important for what we're doing. So let us look at this more closely. It has been noted many times in the MDS literature, but it has not been used for graph drawing so far. Let me just characterize it a little bit. If we have input coordinates of the same dimensionality we're trying to reconstruct, so we're doing a full reconstruction, then changing the origin, resolving this degree of freedom in a different way, corresponds to a rigid transformation. So we're just shifting the image around. If we do a low-dimensional reconstruction, because the input may not be metric at all, then this results in a distortion. All right?
If we pick any particular vertex of the graph as the origin, we can set up the inner-product matrix in this way, which looks almost like the previous one, because the previous one used the average of all vertex coordinates, and now it's only one particular one. So in general, we can weight all of the vertices in this way: any convex combination of the vertex positions can be chosen as the origin. And all that needs to be done is to set up the inner-product matrix in a slightly different way. So we're not changing the complexity and so on, we're just choosing where we would like the origin to be.
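A sketch of how the inner-product matrix changes when the origin is moved, assuming the weight vector w is a convex combination (non-negative, summing to one); the uniform vector recovers the centroid version, and a unit vector places the origin at a single vertex:

    import numpy as np

    def inner_products_with_origin(D, w):
        # D: n x n dissimilarity matrix, w: convex combination weights (sum to 1).
        # The origin of the reconstruction is placed at sum_k w[k] * x_k.
        S = D ** 2
        r = S @ w                  # weighted row sums: sum_k w_k d(i,k)^2
        c = w @ S @ w              # constant term: sum_{k,l} w_k w_l d(k,l)^2
        return -0.5 * (S - r[:, None] - r[None, :] + c)

With w = np.ones(n) / n this reduces to the standard double centering, and with w concentrated on the vertices of lowest eccentricity it corresponds to the graph-center example on the slides.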
And the effect is seen here. So this is the centroid of the configuration and the standard
classical multi-dimensional scaling solution that you would get.
If we pick, for example, this vertex, the result is slightly different because we're
reconstructing different inner products. And if we pick, for example, the graph center,
so the average of all of the vertices in the center of the graph, the vertices with lowest
eccentricity, then the result would look like that.
All right. So we can control a little bit of the shape by focusing on one particular area.
And this is the first contribution: we are suggesting a method to do focus-and-context within the framework of classical scaling. All right? And the approach is fairly simple. The goal of the focus-and-context technique is to enlarge some area, to have a more detailed view of something that is too large to be represented on the screen entirely; so we focus on one area, but we would like to preserve the context enough to orient the user in the graph. This is the main idea. What we do here is we set the origin of the configuration to the focal point that we would like to address. And then we change the way that the inner products are being set up.
And we change them in such a way that, as we have seen before, the inner products are now expressed in terms of some particular origin. We control the contribution of the length, the distance from that origin, and we control the angle between the two position vectors. So what this does is we can enlarge an area by choosing r appropriately, focusing on all of the distances that are close to the focal area, and we can also spread the context around by changing the exponent for the angle.
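One way to write that down as a sketch: starting from an inner-product matrix B whose origin is the focal point, split each entry into the two radial lengths and the cosine of the angle, and apply exponents r and q to those parts. The exact functional form below is an assumption for illustration; the definition used in the paper may differ.

    import numpy as np

    def focus_context_inner_products(B, r=0.5, q=0.5):
        # B: inner products set up with the focal point as origin (see above).
        # r in (0, 1] damps the radial lengths (enlarging the focal region),
        # q reshapes the angular part (spreading out the context).
        rho = np.sqrt(np.maximum(np.diag(B), 1e-12))    # distances from the focus
        cos = np.clip(B / np.outer(rho, rho), -1.0, 1.0)
        cos_q = np.sign(cos) * np.abs(cos) ** q         # modified angular part
        return np.outer(rho ** r, rho ** r) * cos_q
        # The modified matrix is then decomposed exactly as in classical scaling.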
In comparison to previous methods, for example fisheye views of graphs, which are very typical, this is an input-space method. We're changing the data that is input into the algorithm, rather than what is usually done in output-space methods, where the layout method already introduces some distortion and then the drawing is additionally distorted. All right?
So one thing to notice about this particular fisheye method is that the context is fairly narrow, because we're enlarging some area in the [inaudible]. But by controlling q we can enlarge the context, or use the empty space around the context to show more of it in the same drawing. All right? So it's one way to control it. And I think one of the good things about it is that the distortion happens only when the method is applied; we do not have these uncontrolled interferences between two distortions that are introduced into the drawing.
Because we have these two parameters, we can change -- we can use them differently.
So for the same graph we can put the focus here where the color indicates how much
the edges are distorted. Or we can put the focus here. So this is just a change of origin
to show that we can actually enlarge different areas.
But now if we change the parameters a little bit, we see that in this case the focal region is enlarged and the context is slightly shrunk, just by changing the parameters in this way. And of course the graph has been chosen because things are easily observable in this particular graph. We can apply this also to bad layouts, and the same thing happens. But of course if the layout is that bad in the beginning, the focus-and-context does not look that good on this graph either, all right, because it's not a good demonstration. But it shows that putting the focus here, putting a focus there, putting a focus there has the same effect on any kind of layout. Sasha.
>>: Is there a reason why you chose R and Q the same?
>> Ulrik Brandes: No. That's just for illustration here. It can be anywhere between 0 and 1. And here we see that when we choose them -- oh, no, we didn't choose them different.
>>: [inaudible].
>> Ulrik Brandes: Oh, yeah, here it is different. It should be independent controls. But
maybe I didn't demonstrate it enough. All right. So that's one thing.
Because classical scaling can be approximated very fast, this is an interactive method. So I claim that the layout usually takes less time than rendering the graph, so we can do this in an interactive setting where we change the two parameters or where we change the focus to our liking.
The second -- no, not the second, the addition to this first method is that maybe we have multiple foci that we would like to look at. Using the same type of construction that we used for picking one of the origins, we can also pick different focal regions and then combine the results. All right. So we do a weighted combination of all of the inner products. Then we have to pick a common origin, or a common focus, for all of them, which will affect the results. But what you can see is that there is one focus here, one focus there, and if we combine them, then we get these two enlarged areas as well. All right? But here it is a matter of where we choose the common origin what the actual results will be, because of the translation variance of the inner products. All right.
Now, the second modification, I think, is the more important one. The first one is maybe practically more relevant, but I think this one, the one that is coming now, is more important for the quality of the results. If we consider classical scaling as a reconstruction task, then there are actual coordinates, and we can rewrite the way that the coordinates are produced into something that resembles more a projection. It is actually known as principal component analysis, but I'm summarizing what it does.
We have the original coordinates, and these coordinates are transformed by equalizing the spread along the dimensions and orthogonalizing them. Never mind the details; these are the eigenvectors of the inner-product matrix of the coordinates. Using this we get some projection that has this effect.
It can be shown that this is the maximization of the distances between points. All right. So we're spreading out the drawing in all directions as much as we can. This is what would happen if we knew the coordinates. Now, what we're going to change is this projection matrix. All right? Because the deficit of classical scaling really is that, if we have these high-dimensional coordinates, then to maximize this term it greedily selects the dimensions. This is what is most important in these data transformation operations: we would like to maximize the variance along the dimensions. But for the drawing this may be very bad. So here the first dimension that is picked is the longest direction through the graph. And then the second dimension is the second-longest axis, basically, that we can find in this drawing. So this is the result. Even though there's a three-dimensional configuration that defines the input, it is not very well visible in this type of projection. All right? So what we'd really like to see is most of the graph.
And how can we do this? Well, we change the objective to a weighted version of the original objective by modifying the projection matrix. So instead of the eigenvectors of the inner product of the coordinates, we're using a weighted version of this, where this is the Laplacian of some weighting scheme that I'm not going to detail. This has been observed by Yehuda Koren and Carmel and explained in detail; we're just using it here to improve the projection that we get for the original coordinates. And this is the result: if we just take the adjacency matrix as the weighting scheme, compute the Laplacian out of this and then enter this into the eigenvector computation, then this is what we actually see of this three-dimensional configuration.
Because we're now maximizing the length of the edges in the graph, all right? The weights filter out all of those pairs that are not connected. So now we're not maximizing all the distances, but only those distances between nodes that are adjacent. This is why we see more of the edges in this projection.
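A sketch of that weighted projection, assuming high-dimensional coordinates Y are already available and L is the Laplacian of the chosen weighting scheme (for example of the adjacency matrix); this follows the Koren and Carmel idea referenced above, as an illustration rather than the paper's code:

    import numpy as np

    def weighted_pca_projection(Y, L, dim=2):
        # Y: n x h high-dimensional coordinates, L: Laplacian of the weights.
        # Pick the projection that maximizes sum_ij w_ij ||p_i - p_j||^2,
        # i.e. the top eigenvectors of Y^T L Y instead of Y^T Y (plain PCA).
        M = Y.T @ L @ Y
        eigvals, eigvecs = np.linalg.eigh(M)
        U = eigvecs[:, np.argsort(eigvals)[::-1][:dim]]
        return Y @ U               # n x dim projected drawing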
Now, in our case, the coordinates are not known in the beginning. All right? And the more dimensions we have in classical scaling, the better the representation is. So our new technique here is to start from the dissimilarity matrix and do classical scaling, but in higher dimensions than we actually have as a target, to have more of the details represented in this higher-dimensional space. But because most of us are not capable of looking at 10-dimensional drawings, we then apply weighted PCA to get a good projection of this drawing.
So we choose some number of dimensions, for example based on where the eigengap is, based on where most of the data is already represented, and then we do this type of projection. For example, in the same way that I showed before, but to be sure that we're not looking from a very bad direction that has lots of occlusions, we're adding a few random dyads to stabilize the projection with respect to what we would normally get.
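Putting the pieces together, a hypothetical end-to-end use of the two sketches above; the number of dimensions and the number of random dyads are placeholders chosen for illustration:

    import numpy as np

    # D: shortest-path distance matrix, A: 0/1 adjacency matrix, n: number of vertices.
    Y = classical_scaling(D, dim=6)            # higher-dimensional CMDS
    W = A.astype(float).copy()                 # weighting scheme: the edges ...
    rng = np.random.default_rng(0)
    for i, j in rng.integers(0, n, size=(20, 2)):
        if i != j:                             # ... plus a few random dyads
            W[i, j] = W[j, i] = 1.0
    L = np.diag(W.sum(axis=1)) - W             # Laplacian of the weights
    X = weighted_pca_projection(Y, L, dim=2)   # final 2D drawing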
So we add just a few of those. And it can be shown, and that's in the paper, that this is again related to the choice of origin; the choice of origin can be expressed in this framework as well. This is the final slide, showing just one example. This is a typical 2D classical
scaling of one of the matrix market examples. This is what we get when we do this in 6 dimensions to represent more of the details in that higher-dimensional drawing. And
once we have the coordinates we choose a perspective that shows most of the graph.
And even though this may look very fuzzy, it actually shows in this detailed view that we
have all the differences between those nodes that would not be visible in the two
dimensional projection, because they're only represented in higher dimensions. This is a graph that has half a million edges. Using the basically linear-time approximation algorithm for classical scaling, this is done in seconds on a regular laptop computer. All right? That's all I had to say. Thank you.
[applause].
>> Host: Eric.
>>: [inaudible] interactivity saying [inaudible] can you make a smooth transition from one [inaudible] to another [inaudible] by adjusting the weight? Is it possible? It looked like it's possible, but I --
>> Ulrik Brandes: I haven't thought about it, but I think so. I think it's a good suggestion that we change the weighting scheme from the unit vector that focuses on one vertex to the unit vector that focuses on the other by just interpolating between them.
>>: Seems by just adjusting the two parameters, yeah.
>> Ulrik Brandes: Not the two parameters, but the choice of origin. So if we move the origin from one node to the other --
>>: [inaudible] geometrically but --
>> Ulrik Brandes: In input space.
>> Host: Okay. Ben.
>>: That was a terrific presentation. I appreciate it. I have two concerns. One is the
fisheye representation. In the example you showed the zoom factor is only about two or
three, and most studies show that the spatial instability is damaging and that the
advantage only comes when you get five or more.
So have you gotten examples where the focus versus context has a zoom factor of at
least five and that really shows a value to the tool?
>> Ulrik Brandes: That's a very good point. We -- we basically wanted to demonstrate that this method is more flexible than it seems to be. But of course that --
>>: Magnification approaches or distortion-based approaches are undermined by the lack of spatial stability. So that was a concern.
The other about the multi-dimensional scaling, I'm still not convinced that it provides an
effective representation because the places -- the parts of the graph where the
representation is weak are not emphasized. So we don't know where to trust it and
where to doubt it.
Is there a way you see of solving that problem to make CS a better payoff?
>> Ulrik Brandes: So I think that CS is really -- it's good at representing, for example, finite element meshes, because they have some geometric origin to them. If it's a graph like
in the beginning, we still would like to use this as the initialization. But if the initialization
is better, then the distance scaling will work better as well.
The representation error is something that we thought about representing graphically,
for example, by coloring or something.
>>: [inaudible].
>> Ulrik Brandes: We don't know --
>>: Try another idea which maybe you thought of or have a solution for, which is can
you characterize a family of graphs for which MDS provides effective representations?
>> Ulrik Brandes: Yeah. That depends on the spectrum.
>> Host: Okay. So David?
>>: Okay. Thank you.
>>: So this relates to something you said at the start of the talk and then again just now
that you say we should be doing CS and then DS. When you try it with this method,
does the DS undo some of the differences that your weighting makes or is there a way
to make your choice of weights for the CS part to carry those weights over into the DS
space and see how it goes in.
>> Ulrik Brandes: I'm not sure whether it's easy to control what's happening. I would
think that we could limit the displacement that DS is actually doing in the end and
maybe we could actually change the weights in the DS. But that's something that needs to be explored.
>> Host: Steven.
>>: Why six?
>> Ulrik Brandes: Oh, because that's where the eigengap was largest. So any
additional dimension would not have added much more detail.
>>: So in general, you can look at the eigengap and choose to go to 20 or 7 or --
>> Ulrik Brandes: Yes.
>> Host: Okay. Thanks for giving a nice presentation.
[applause].
>> Host: And the second talk will be given by Lowell Trott. And the title of the talk is
Force-Directed Graph Drawing Using Social Gravity and Scaling.
>> Lowell Trott: So this was work done with Michael Bannister, David Eppstein and
Michael Goodrich.
So just a brief outline. We're going to cover a little bit of background. Briefing on
force-directed methods because we've been talking about that a little bit earlier. What
we mean by social gravity. What we mean by scaling. And then we'll look at some
examples in trees and drawings of social networks, with a brief foray into some things
that concern Lombardi.
So quickly, what are force-directed methods, what we'll also refer to as spring embedders? They're an algorithmic method for drawing graphs that is fairly flexible. And a nice feature is that they require no domain knowledge about the graphs being input.
The model we're basing our algorithm on uses primarily two simple forces: a repulsive force between all nodes and attractive forces between adjacent nodes.
So some of the challenges that these methods face is that modeling a physical system
tends to have local minima that you can get trapped in and not produce the optimal
drawing that you would prefer. And modifying the sort of simple forces by adding
additional forces can make it even harder to get out of these local minima.
So what are we going to do? Let's add more forces, see what happens. So social gravity is the force we're going to be adding to the simple set. The way this breaks down is that the force acting on a particular node is made up of a gravitational constant, which we'll talk about in a second; the mass of the node, which we'll actually use social metrics to define; and that node's position relative to the center of mass of the graph.
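A minimal sketch of that gravitational force, assuming positions as an n-by-2 array and centrality values as the masses; the exact functional form in the paper may differ (for example in how distance enters), so treat this as an illustration only:

    import numpy as np

    def social_gravity_forces(pos, mass, gamma):
        # pos: n x 2 node positions, mass: centrality-based masses, gamma: constant.
        center = np.average(pos, axis=0, weights=mass)  # center of mass
        return gamma * mass[:, None] * (center - pos)   # pull toward the center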
So what do we mean by mass in terms of a social metric? Well, we're going to use something called centrality to define which nodes are important. Centrality is a metric used in social network analysis to give some understanding of the importance of nodes, or rather the entities they represent in a social sphere, based on their location in the graph structure.
So there are a few different definitions we'll be referring to; we'll get into them here. One is degree centrality. Pretty straightforward: a node's degree centrality scales with its degree, the idea being that if a node has higher degree, then maybe in a social context it's more likely to get infected by an infection traveling in the network, just because it's more connected.
Closeness centrality is just a measure of how close a node is to all other nodes in the network. So it's the inverse of the mean distance from a node to all other nodes. And as you can see in the drawing to the right, as a node gets more central, you'll see the closeness centrality rise in our heat map. It kind of goes from lower, cooler colors to brighter reds. That will be important later.
And then finally betweenness centrality, which is just a summation of the fraction of
shortest paths going through a particular node.
So the idea being that if you're a single node, you consider all shortest paths between pairs of other nodes in your graph, and then some number of shortest paths between those nodes will pass through you and some will not. So your fractional importance is the number of those that do pass through you over the total number of shortest paths.
And then you just sum that over all possible pairs in the graph. And as you can see in
the example to the right the centrality of the top two nodes is lighter than the centrality of
the bottom center node because the shortest paths between pairs from the left to right
are split between -- or the possible shortest paths are split between those nodes,
whereas in the bottom example they're not.
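For reference, all three centralities are available off the shelf, for example in networkx (shown here on a standard example graph; any social network would do):

    import networkx as nx

    G = nx.karate_club_graph()                  # a small example social network
    degree = nx.degree_centrality(G)            # scales with a node's degree
    closeness = nx.closeness_centrality(G)      # inverse mean distance to all others
    betweenness = nx.betweenness_centrality(G)  # fraction of shortest paths through a node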
All right. So what do we mean by scaling? The idea is that we couldn't from the onset
of our algorithm just input the gravitational force as a constant value and run the
algorithm. So if we look at a simple forest here laid out with a classic inverse square
spring embedder, you'll see that it has kind of separated a little bit.
Now, if we add our gravitational force, and you have to -- this is where I need audience imagination. These slides are PDF now, so they don't actually animate anymore. But that graph over to the right that is obscured will slide over to the center. You can see in this graph that the subcomponents of the forest have been pulled together using gravitational forces.
Now this was done with a gravitational force that was consistently high throughout the
layout of the algorithm. And you'll see that it's a pretty decent drawing. We don't have
any crossings. The subcomponents were closer together, maybe easier to view. And
then in the next drawing we'll see that this is a drawing where the gravitational forces
were scaled up. So gravitation started at a very low value and increased over time until
we achieved a drawing that we thought was acceptable.
And then if we compare the two, you'll see that there are not many differences here. So this is actually kind of a counterexample to what I'm about to tell you. But it works pretty well in the small case: how the gravity is increased seems not to affect what we're doing.
But if we look at this example, a little bit of a larger forest, we'll see that there are
actually a lot of crossings. One, two, maybe 16. But this -- this forest, because it's a
little bigger and a little more complicated, wasn't able to cleanly break from itself before
we began a gravitational layout.
So in the next case you can see the same forest where gravity started out low and
slowly increased until the final drawing was achieved. And here we get a lot cleaner
visualization. So there is actually one crossing in there. I'm not going to tell you where
it is. You have to find it for yourself.
So how do we do our scaling? Well, we have this very complicated diagram here. Not really. Because what we found in our algorithmic study is that how we tweaked our scaling wasn't really as important as that we did increase from a lower value to a higher value. So we used this simple step function: a few iterations, increase gravity, a few iterations, increase gravity, and so on and so on, until we achieve something we thought was a nice drawing.
What we found is that varying the iterations before gravitational increase or the size of
that gravitational increase didn't actually modify our results too much. Obviously there's
some work here to see the effects of a -- maybe a more complicated function on the
final layout.
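As a sketch, the step-function scaling just described could look like this, where spring_iteration is a hypothetical stand-in for one pass of the underlying spring embedder (repulsion, attraction, and the social-gravity force) and the iteration counts are placeholders:

    def layout_with_gravity_scaling(graph, spring_iteration,
                                    steps=10, iters_per_step=50, g_max=1.0):
        # Run a few iterations, increase gravity, and repeat until g_max is reached.
        gravity = 0.0
        for _ in range(steps):
            for _ in range(iters_per_step):
                spring_iteration(graph, gravity)
            gravity += g_max / steps
        return graph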
And so now we'll look at some examples in terms of trees. So here's a tree drawn with
closeness centrality coloring. So the red nodes, as you might imagine, have a higher
closeness centrality value.
And when you see a drawing [inaudible], this tree kind of spreads out, and some of the connections, the edges, might be harder to determine cleanly. You get kind of less angular resolution out at the extremities.
But when we draw this with a little bit of gravity, what we see is that our total area
coverage per node is decreased. So a higher node density and a better distribution or,
excuse me, a more uniform distribution of nodes.
Now, you do kind of sacrifice some of the I guess immediate contextual notion of where
nodes are in the tree. But if you pay close attention, you can see that there are no
crossings, and separate tree paths still remain fairly separate, and our higher centrality
nodes, because of that gravitational constant, are pulled towards the center.
So here we have a convenient coloring but if you were labeling these nodes with
something else not centrality, the nodes that you might prefer to learn about in a social
context, those with high centrality based on your decision of which metric, are in the
center of the graph. So those would be probably viewed more aptly by someone trying
to examine the data.
So here's another tree. We can look at this, this tree drawn as well and there would be
some nice animations here. But imagine -- imagine trees sliding in.
This is the same tree drawn with a few different types of centrality. So up on the left here is degree centrality. This is closeness, and on the far right is betweenness centrality.
So this is just to illustrate that although the centralities are somewhat related, they do
differ. But the structure is generally the same for these simple trees. Because
centralities have some -- share some notion, the structure of the final drawing tends to
be similar.
Now, here's an example of a much larger forest with a few interesting pieces. The
larger pieces trees of the forest are drawn towards the center because they contain
nodes with higher gravitational masses. And you still get, despite the number of nodes,
the connections, crossing free drawing.
So moving on to actual social networks, here's just a few examples. On the left you'll see a social network that kind of looks like a traditional drawing. Maybe the nodes that have high -- or long edge connections out are further away from the center than you might hope based on their centrality value.
On the right, as you can see, those nodes are drawn in towards the center. And there's
a definite increase in angular resolution on the extremities.
Here's another tree drawn with closeness centrality. And finally one with betweenness
centrality. And as you can see, the increase in uniform node distribution really helps in
sort of distinguishing which nodes are connected. You kind of lose that in this region,
but you're able to understand that a little better on the right. So not to belabor the point, but we've got a few more real-world social networks. So this is an infection graph with closeness centrality. And one more with betweenness centrality coloring.
So even as the nodes get more complicated, we -- or the graphs get more complicated, we still gain angular resolution and a little bit of understanding of the local node placement.
So a brief foray into Lombardi. Hopefully most of you who were here last year heard from Olga who Mark Lombardi is. He's an artist who drew conspiracy networks by hand, with long angular -- or, excuse me, long arc edges. So on the left here we have an example of his work. And on the right, a straight-line approximation of that drawing.
And then so if we slide this over, we can see that same graph drawn with our algorithm.
Now, while we don't maintain the same node placement that Lombardi used when he
hand constructed this, we do gain some movement of the high centrality nodes towards
the center. These are -- I don't remember if I said -- conspiracy networks. So you might actually want to know that the nodes more central to the conspiracy would be located at the center of your network.
And then here's another example. And on the right that drawing with our algorithm.
And then because this is a force-directed method you can actually use one of the
results from last year to do a force-directed Lombardi drawing to augment the forces
used and add curvature to the edges.
So on the right you'll see some long curved path that you would hope to get from those
results. So our gravitational force actually works seemingly well with the Lombardi
force.
So now let's take a brief look at our algorithms in action. So if I can full screen this. So
here you see it -- oh, sorry. A tree being laid out. And as the gravitational force
increases you'll see those extremities spread and be drawn towards the center. And the
nodes reach a more uniform distribution. And obviously if gravity gets too high, you
may have caught it there, you get one crossing. So there is some limit to how high you
can take gravity before your drawing starts to degrade. Here's a small social network.
And so the pulsing that you see happening is actually the scale of the drawing
increasing. So as the gravity pulls the nodes together we're -- and the bounding box
shrinks, we're blowing the drawing up. So just to maintain visibility. But when you think
of relative position, you still get the benefits we're looking for.
So here in the forest you'll see that the standard forces actually repel the subtrees. This is where that scaling becomes really helpful. So this allows some of those crossings that would get caught up to be cleanly eliminated. And then the gravity begins to draw the subtrees back in until you get the drawing you're going for.
All right. Great. That's all I have. Thank you very much.
[applause].
>> Host: [inaudible].
>>: Do you have a way to control whether the subtrees of the forest overlap in the end? Because it looks like they were repelling each other.
>> Lowell Trott: Yeah. So they're repelling each other with the standard repulsive
forces. So one thing that we didn't get into too much is balancing those repulsive forces
with the gravitational forces. We kept the gravitational forces fairly light in comparison, because we thought that the traditional methods seem to construct really nice graphs on a regular basis. So we didn't want to overshadow that with a really intense gravitational force.
But, yeah, if you want that kind of intermingling, you can reduce that initial scaling period, so you have a stronger gravitational force at the beginning.
>> Host: Other questions?
>>: Yes. So if [inaudible] showing circular outlines in the end, since most computer
screens or pieces of paper are rectangular is there a way to define gravity in a way that
you would get something rectangular?
>> Lowell Trott: We thought about that. I don't actually remember how the -- yeah, maybe it's --
>>: [inaudible] actually tried an [inaudible] L1 metric or something like that instead of the normal metric, in the hopes that it would -- it just produced chaotic -- [laughter]. But I think it would be very simple, and I didn't try this one yet, but to simply, you know, get oval shapes so that you could maybe fit a wide-screen monitor better by scaling one of the components. That should work. I don't see [inaudible] to get squares [inaudible].
>>: So [inaudible] overall structure of the graph. Have you done any studies to see if these outweigh the costs to [inaudible]?
>> Lowell Trott: No, we haven't. But I would love to see the results of such a study.
>> Host: More questions? Okay. Let's thank Lowell again for a nice presentation.
[applause].
>> Host: And the title of the third talk is Progress on Partial Edge Drawings. And the
talk will be given by Till.
>> Till Bruckdorfer: Okay. So I'm going to talk about progress on partial edge drawing.
It's joint work with Sabine Cornelsen, Carsten Gutwenger, Michael Kaufmann, Fabrizio
Montecchiani, Martin Nollenburg, and Alexander Wolff.
So first of all, what is a partial edge drawing? Consider an arbitrary graph with clutter caused by many edge crossings. One of the main questions in graph drawing is how to reduce the number of crossings. The traditional way is to simply change the embedding. But what we do is we break the edges, so every edge becomes two pieces, which we call stubs. And now we shrink the size of the stubs.
So if we remove the middle half of every edge, we get a layout with only 23 crossings, compared to over 400 crossings before. We can even reduce the number of crossings to zero, and -- yeah, this is a layout showing that.
And the motivation for this is a talk from last year given by Burch, et al. He evaluated partially drawn links for directed graph edges, considered two layouts for edges, the tapered and the traditional drawing of edges, and considered variants of the length of these stubs. And he wanted to know whether such a drawing still makes it possible to understand the graph. For example, the question: is there a link connecting the red and the green node? And the users in the study had to click buttons, yes or no, and so on. And the result was that for this question, when we shrink the size of the stubs, the error rate increases. But if we consider another question, like pointing out the node with the highest degree, then we can see that with the lowest stub size the error rate decreases.
And this is very interesting, because at the stub length of 75 percent there is a dip, and that motivated us to introduce a formal definition for partial edge drawings. So a partial edge drawing is defined as a drawing where every edge is replaced by two stubs incident to the start and the end vertex, and the stubs are not allowed to cross.
So if you ask now which graphs admit such a drawing, the answer is any graph, because you can draw the stubs sufficiently small. To make the question more challenging, we introduce other properties. Consider the complete graph with 6 vertices. Then we can introduce symmetry, which means both stubs of every edge have the same length, and we can introduce homogeneity, which means the ratio of the stub length to the edge length is the same over all the edges. In this case, we removed a quarter of every edge. And, of course, we can combine the properties to get a symmetric homogeneous partial edge drawing. This is an interesting drawing, because if the ratio of the stubs is known, you can guess from the length of a stub and the ratio where the corresponding end vertex is.
And so we consider those drawings with a given ratio. In this case we have a ratio of one quarter; that means one stub is a quarter of the total edge length. And we also consider the case of symmetric partial edge drawings where we maximize the total length of the stubs.
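As a small illustration of the definition (not of the algorithmic results that follow), here is how the stubs of a delta-symmetric homogeneous partial edge drawing can be produced from an existing straight-line layout; note that it does not check that the resulting stubs are crossing-free:

    def stub_segments(pos, edges, delta=0.25):
        # pos: dict vertex -> (x, y); edges: list of (u, v); delta: stub ratio.
        # delta = 0.25 keeps a quarter of the edge at each end, i.e. it removes
        # the middle half of every edge.
        segments = []
        for u, v in edges:
            (ux, uy), (vx, vy) = pos[u], pos[v]
            segments.append(((ux, uy), (ux + delta * (vx - ux), uy + delta * (vy - uy))))
            segments.append(((vx, vy), (vx + delta * (ux - vx), vy + delta * (uy - vy))))
        return segments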
So what I'm talking about: first we show that a 1/4-SHPED does not exist for a complete graph with more than 212 vertices. Second, I show the existence of delta-symmetric homogeneous partial edge drawings for some classes of graphs. And finally, we consider the minimum SPED, which has a 2-approximation. A minimum SPED corresponds to the maximum SPED; you minimize the gaps instead of maximizing the stubs.
Okay. First of all, we require no embedding and fix the ratio to a quarter to make it easier now. There is a result from previous work that the complete graph with at most 16 vertices admits a 1/4-SHPED. And a drawing of this looks like this, as you see on the left side. And it seems to be impossible to add a further vertex to this drawing without any crossings. But what we can only prove is that for more than 212 vertices it's not possible to draw a 1/4-SHPED.
And our first idea was: we considered point sets with 30 to 50 points, and [inaudible] says that then we have 17 points on one half of a convex curve, and for those points we can easily show with the following techniques that such a drawing does not exist. But the number of points is very huge. So we reduced the number by taking a closer look at the point set.
So if you consider an arbitrary point set, you choose two points with maximum distance and rotate the points so that their connecting line is horizontal, [inaudible] a rectangle enclosing all the points. And now we consider just the top part of this rectangle. Here we have three points, a left, right, and top point, having stubs, and they define a partition of this rectangle with the property that every point in such a cell has three stubs crossing the cell boundary.
And now we compute for every cell of these 26 cells how many points fit in the cell. As an example, we consider the red-signed cell, where we have at most one point with the property that it has three stubs crossing three different segments of the boundary. And a further point has two stubs crossing one segment of the boundary. And from the slope of the top stub of the lower point and the size of the left stub of the upper point, we can compute that there are only two points possible in this cell. And so we do this for all the cells and get a total number of 107 points in the top part of the rectangle. Now, if we double this number and take into account that the left and the right point are counted twice, we get this number of 212 points.
Okay. Now, we come to the existence of a delta SHPED for some classes of graphs.
And the key concept is that we enclose every vertex and its stubs by a circle, and if we
know that the circles don't intersect, we know the stubs do not intersect, although there might be drawings where the circles do intersect but the stubs don't.
So if we consider two connected vertices at a certain distance, we see that one unit of distance covers two times the radius of such a circle. So we can compute from the distance the ratio of a stub such that it is contained in the circle. And in this case, for a distance d, we have 1/(2d) as the ratio.
If we consider now graphs with bandwidth k, that is, graphs where we can order the vertices along a line such that if two vertices are connected, their distance in the ordering is at most k, then we are sure that there is a 1/(2 sqrt(2k))-SHPED. The way we do that is we arrange the vertices in a snake-like fashion and compute the maximum distance between connected vertices, and this is sqrt(2k) in this case. And by our first consideration with the key concept, we find this sqrt(2k) in the ratio.
Now, for k-circulant graphs, we can do a similar consideration. K-circulant graphs are graphs where we can order the vertices on the boundary of a circle and every vertex is connected to at most k vertices to its left or right. So we take the same snake-like fashion and draw it on an annulus, and if we compute the distance, we again get a value of 4 sqrt(k); replacing it in the key concept formula, we can prove this.
Now, we can change the concept. Instead of circles we can consider rectangles and ensure that they don't overlap. For bipartite graphs we considered just one set of vertices, which we place side by side, and we can compute from the ratio the number of vertices in a row. And we can also compute the number of vertices if we place the vertices of one set on top of each other, in the way that if we have placed one vertex, the next one is upper-bounded by the value of one minus delta, and we can iterate this down to the lower bound of delta. So if we see in the formulas a value of one over delta, we know there is a side-by-side placement of vertices, and if there is a value with a logarithm, we know there is a top-on-top placement of vertices.
So if we consider now the complete bipartite graph on 2k plus n vertices, we see there is a bound for k with a logarithm, so there is a top-on-top placement in the construction, as you see at the top and the bottom of this. And the remaining n vertices are placed on a horizontal line. In this case, if we have a ratio of a quarter, then this k is bounded by 4.8. And we can also consider the complete bipartite graph on 2n vertices, and there both of the factors appear. That means we have side-by-side placement within a column and top-on-top placement of the columns looking in the horizontal direction, both on each side of this drawing. [inaudible] to have no overlappings of the edges, we move some of the vertices slightly.
Now, we come to the minimum SPED, which has a 2-approximation. First we change the setting in such a way that we now require fixed vertex positions. Such graphs are called geometrically embedded graphs. And for maximum SPEDs of geometrically embedded graphs there is a result: it can be computed by dynamic programming in O(n log n) time if the graph is 2-planar, and it is NP-hard to compute in general. So we consider now the minimum SPED. And here we get a 2-approximation in O(n^4) time, where the approximation is quadratic in the number of crossings and the number of crossings is quadratic in the number of edges.
As I said before, the minimum SPED is obtained if we minimize the gaps of a drawing, and therefore we show a transformation from the minimum SPED to minimum weighted 2-SAT. And what we already know is that this problem has a 2-approximation. Okay. What we do here is we consider an arbitrary edge and the crossing edges, denoted by f1 to f3. We order the crossing edges f_i according to the distance to the closest endpoint of the edge. And the pairs of segments E_i are now the parts from, for example, this vertex to f2; that defines one part of the segment E_1. So E_i in general is defined by the (i+1)-th crossing edge. And those segments now give the variables in the instance of our problem. We say that the pair of segments is not drawn if the variable in the instance is true. Then we have to introduce the implication that if we draw segment E_{i+1}, the segment E_i must be drawn too. And, of course, for every crossing we introduce another clause, so that not both of the crossing segments are drawn. And for every variable we introduce a weight w_{E_i}, and we now get an instance of the problem of minimizing the weighted sum of the variables over all valid variable assignments.
This is quite interesting, but what does it have to do with the maxSPED? For exact solutions, minSPED is equal to maxSPED. But to get a true approximation for the maxSPED, we would need a transformation to the maximum weighted problem. This would also solve the maximum independent set problem, and that has no 2-approximation as far as I know. So it is unlikely that the maxSPED has a 2-approximation.
So let me conclude now the results. We have proven that there is no quarter SHPED
for the complete graph with more than 212 vertices.
We found some classes admitting delta SHPED and computing maximum SPED is
NP-hard in general, and the minSPED can be 2-approximated.
For the future we want to find more classes of graphs admitting a delta SHPED, and we
want to close the gap between 16 and 212 vertices. And as we saw in the picture on one of the first slides, it might be possible to improve the bound to 17, but it seems to be very, very hard.
And of course we want to generalize this concept not to have a ratio of a quarter but a
general ratio. Okay. That's it.
[applause].
>> Host: Any questions for Till? So let's thank Till again. Thank you.
[applause].
>> Host: So the title of the last talk of this session is Implementing a Partitioned 2-page Book Embedding Testing Algorithm. And Marco will give the talk.
>> Marco Di Bartolomeo: Okay. Good morning. My name is Marco Di Bartolomeo.
And this presentation -- I'm from Roma Tre University. And this presentation is a joint
work with Patrizio Angelini and Giuseppe Di Battista.
A 2-page book embedding is a type of graph embedding where the nodes are aligned along a line called the spine, and the edges are assigned to two half-planes sharing the spine, called pages. It is done in such a way that in each page the edges do not cross.
The partitioned 2-page book embedding problem for a graph consists of finding a 2-page book embedding for it, where the assignment of the edges to the pages is part of the input.
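As a sketch of the definition (a quadratic checker, not the linear-time recognition algorithm discussed in this talk): given a spine ordering and the fixed page assignment, two edges on the same page cross exactly when their endpoints interleave along the spine.

    def is_partitioned_2page_embedding(spine_order, page1_edges, page2_edges):
        # spine_order: list of vertices; page1_edges, page2_edges: lists of (u, v).
        index = {v: i for i, v in enumerate(spine_order)}
        def page_is_planar(edges):
            spans = [tuple(sorted((index[u], index[v]))) for u, v in edges]
            for k, (a0, a1) in enumerate(spans):
                for b0, b1 in spans[k + 1:]:
                    if a0 < b0 < a1 < b1 or b0 < a0 < b1 < a1:
                        return False       # interleaving endpoints -> crossing
            return True
        return page_is_planar(page1_edges) and page_is_planar(page2_edges)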
This problem lies between two other very important graph drawing problems, cluster planarity and simultaneous embedding with fixed edges, which are very hot topics currently. [inaudible] can be used to solve cluster planarity when there are only two flat clusters, and can be used to solve SEFE when the intersection graph is a star. We find it interesting and promising in the way the book embedding gives a different point of view on these two problems, which are apparently very far from it.
Hong and Nagamochi solved P2BE with a linear-time algorithm. They characterize the problem as the problem of finding a particular planar embedding of the input graph. From such an embedding it's possible to build an auxiliary graph. And finally, a special Eulerian tour of this Eulerian graph gives an ordering for the nodes along the spine of the book embedding.
Now I will give you a very fast overview of the algorithm. So don't worry if you don't get
all the details now. Let's say you have a graph and the algorithm starts searching for an
embedding for it. When one is found the auxiliary graph is built from it. And finally the
Eulerian tour gives the ordering for the nodes. The Eulerian tour must be non-self
intersecting, meaning that it is possible to draw it without crossing the already drawn
line.
Our contribution to this work is an implementation of the algorithm. We performed some modifications to the original algorithm in order to make it more efficient. Please note that it's not about asymptotic time complexity, since the algorithm was already linear; we instead lowered the constant factors of the running time. Among the modifications we have done, there is the part of the SPQR-tree algorithm which finds an embedding, the part that deals with P-nodes, which uses a brute-force approach and code generation.
Finally, we add details to the proof of correctness for the part of the algorithm that searches for the Eulerian tour.
Now, we'll discuss the details of the algorithm. This part is common between our work and the original work. [inaudible] a planar embedding that is said to be disjunctive and splitter-free. Recalling that the edges are partitioned into two groups, let's call them red and blue, we say that a node in an embedding is disjunctive if all the incident edges of each color are consecutive. A splitter is a cycle whose edges all have the same color, where in both of the areas of the plane that the cycle identifies there is either a node or an edge of the other color.
For example, the first cycle in the picture is not a splitter; see, it's empty outside. A disjunctive and splitter-free embedding must be found for every biconnected component, and this is done by a [inaudible] programming algorithm based on SPQR-trees, which collects information from the children of each node of the SPQR-tree. This information is about the presence of colored paths between the poles of each child. However, only those paths that are part of some colored cycle are of interest, since they could be a splitter.
Please note that we also add information on the parent in this picture, and later I will show how that is possible. With this information it is possible to avoid bad decisions. In the figure there is a virtual edge which has the wrong flip, and this produces a splitter which could otherwise have been avoided.
You can see that the colored cycle separates two nodes in the graph. Once a disjunctive and splitter-free embedding has been found, the information is combined and set for the current node. Similar work is done for disjunctiveness. The [inaudible] information collected from the children is about the colors of the edges incident to the two poles and the ordering of such edges. For example, this edge has only blue edges at the poles, while this one has all blue followed by all red. We call such an ordering a color pattern. A color pattern is collected for each child.
We don't have color patterns for the parent at this time, but we know which colors are incident to the poles. Again, this information can be used to avoid bad decisions. In the picture there is a virtual edge that has a [inaudible] flip, making a node not disjunctive; see, here it has more than two color changes in the clockwise ordering of the incident edges. Finally, the information is combined and set as the color patterns of the current node for its parent.
Now, we will discuss the modifications we have done to the algorithm in order to make it more efficient. As I said before, we have information on the parent as well, and this information is about the presence of colored paths in the parent and the colors incident to the poles. This information can be collected by preprocessing the SPQR-tree; that is, a bottom-up traversal of the SPQR-tree, followed by a top-down traversal which combines the collected information.
Another important point is that constraints imposed by disjunctiveness and by splitters must be considered together, since they could be in conflict. For example, in this picture there are two children whose flips are constrained by disjunctiveness due to these red edges. But unfortunately this produces a splitter, which is not avoidable, meaning that the instance is negative; that is, there is no book embedding for it. [inaudible] we need some information for both the parent and the children. We need to know about the presence of colored paths and the colors of the edges incident to the poles, that is, the color patterns for the children.
Everything is combined and used at the same time in order to greedily construct an
embedding for an SPQR-tree node.
The original algorithm doesn't do that. In particular, it doesn't exploit the preprocessing step to collect information on the incident edges, and for this reason it has to build [inaudible] embeddings for each SPQR-tree node, in particular all the possible disjunctive embeddings, leaving it to the parent to select one. The algorithm is still linear, since those embeddings with the same color patterns are equivalent. Our approach, however, builds exactly one embedding per node, being more efficient. Moreover, by handling all constraints at the same time, we can detect negative instances early.
Among the modifications we have done to the algorithm is the part of the SPQR algorithm that deals with P-nodes. Embedding a P-node basically means finding a good permutation of the virtual edges composing the parallel composition. In this problem, it is possible to group together those edges having uniform color patterns, for example these two. Grouping the virtual edges reduces the size of a P-node to at most 8 virtual edges; that is a [inaudible] constant. And for this reason, it is possible to use brute force to search for a solution. Basically, the instance can be matched against the possible solutions until one is found.
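One generic way to organize such a constant-size brute-force search is sketched below; the talk's actual approach matches the instance against a pre-generated set of good permutations instead, and is_valid here is a hypothetical stand-in for the disjunctiveness and splitter-freeness test:

    from itertools import permutations, product

    def embed_p_node(groups, is_valid):
        # groups: the grouped virtual edges of a P-node (at most 8 of them).
        # Constant work, since len(groups) <= 8: try every ordering and every
        # combination of flips until the validity test accepts one.
        for order in permutations(groups):
            for flips in product((False, True), repeat=len(groups)):
                candidate = list(zip(order, flips))
                if is_valid(candidate):
                    return candidate
        return None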
The set of possible solutions is the set of those permutations yielding disjunctive and
splitter-free embeddings, and it can be generated by an algorithm that starts from the
desired result. Let's say that we want a permutation giving an embedding with a
specific pair of color patterns. It must be noted that, given two color patterns, there are
several ways to align them, by making them slide on each other in this way.
For example, in this slide we can see three different alignments of the same pair of
color patterns. Let's concentrate on the first one on the left as an example. This slide
shows how an aligned pair of color patterns identifies colored zones where virtual edges
with the same color can be placed in order to obtain a permutation with the desired color
patterns. The algorithm that generates the set of solutions simply
tries all the combinatorial possibilities of this process.
The original algorithm performs a brute-force search as well, but it searches among all
possible permutations which, although limited by a constant, are still quite a lot:
every possible permutation of 8 virtual edges must be tried, together with all the possible flips
of the virtual edges.
We search in a much smaller space, since we only match against good solutions, and are therefore
more efficient. Moreover, the set can be generated in batch by an offline procedure,
making the search very fast.
Our approach can be generalized in three steps. First, the desired properties of the
permutations are formalized in a specification. Then the specification can drive the
generation of the set of all solutions. And finally, the instance is matched against all
solutions until one is found.
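A hypothetical sketch of the last two steps, with a toy predicate standing in for the real specification of disjunctive, splitter-free permutations (generateSolutions and the stand-in test are invented names, not the actual code):

    #include <algorithm>
    #include <iostream>
    #include <numeric>
    #include <set>
    #include <vector>

    using Perm = std::vector<int>;

    // Offline step (run once): enumerate every permutation of k groups and keep
    // those satisfying the specification. "spec" stands in for the real
    // disjunctiveness/splitter-freeness test.
    template <class Spec>
    std::set<Perm> generateSolutions(int k, Spec spec) {
        Perm p(k);
        std::iota(p.begin(), p.end(), 0);
        std::set<Perm> good;
        do {
            if (spec(p)) good.insert(p);
        } while (std::next_permutation(p.begin(), p.end()));
        return good;
    }

    int main() {
        // Toy specification: group 0 must come first (a stand-in only).
        auto table = generateSolutions(4, [](const Perm& p) { return p[0] == 0; });

        // Online step: the instance is matched against the precomputed table.
        Perm candidate = {0, 2, 1, 3};
        std::cout << "solutions: " << table.size()   // 6 for this toy spec
                  << ", candidate is "
                  << (table.count(candidate) ? "good" : "bad") << "\n";
    }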
Now, we'll describe the part of the algorithm that finds the Eulerian tour giving the
ordering for the nodes on the spine of the book embedding.
Let's assume that we have a disjunctive and splitter-free embedding for the
graph. We construct an auxiliary graph by placing a node in each original face and by
placing directed edges in such a way that, on each node, they separate red
edges from blue edges, in this way. Please note that, although the auxiliary graph
resembles the dual graph, it is not the dual, since it also contains the original nodes of the
graph.
The auxiliary graph we obtain is Eulerian, and for this reason it can be decomposed into
cycles. Being an embedded graph, the cycles are nested, and this hierarchy can be
represented by a tree.
Visiting the tree of cycles with a DFS clearly gives an Eulerian tour, since each edge is
visited just once. The tour can be made non-self-intersecting as well. The reason is that, on each
node, outgoing and incoming edges alternate in the clockwise ordering around
the node, and by following this alternation it is possible to avoid intersections.
By following this principle when passing from a parent cycle to a child cycle that is
nested in it, it is possible to obtain a tour that is non-self-intersecting.
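For the cycle-splicing part only, a standard Hierholzer-style sketch of an Eulerian circuit is shown below; it mirrors the idea of visiting nested cycles once each, but it does not model the planar part of the argument (choosing edges by the clockwise alternation at each node), so it is an illustration rather than the speaker's algorithm:

    #include <iostream>
    #include <vector>

    // Hierholzer's algorithm for a directed Eulerian circuit: follow unused
    // edges until stuck, emitting nodes while backtracking, which splices the
    // nested cycles into a single tour.
    std::vector<int> eulerianCircuit(const std::vector<std::vector<int>>& adj,
                                     int start) {
        std::vector<std::size_t> next(adj.size(), 0);  // next unused edge per node
        std::vector<int> stack = {start}, tour;
        while (!stack.empty()) {
            int v = stack.back();
            if (next[v] < adj[v].size()) {
                stack.push_back(adj[v][next[v]++]);    // follow an unused edge
            } else {
                tour.push_back(v);                     // dead end: emit and backtrack
                stack.pop_back();
            }
        }
        return std::vector<int>(tour.rbegin(), tour.rend());
    }

    int main() {
        // Two directed triangles sharing node 0: 0->1->2->0 and 0->3->4->0.
        std::vector<std::vector<int>> adj = {{1, 3}, {2}, {0}, {4}, {0}};
        for (int v : eulerianCircuit(adj, 0)) std::cout << v << " ";
        std::cout << "\n";   // prints: 0 1 2 0 3 4 0
    }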
And now I will discuss the implementation of this algorithm. We implemented it in C++
and used two graph drawing libraries: GDToolkit, which acts as the main
framework of our application, and OGDF, for its capability to build SPQR-trees in
linear time.
I said before that the P-node embedding part of the algorithm must perform a
brute-force search. It is fast for a computer, but there are quite many cases for a
human programmer: 180 of them. For this reason we developed a code generator
that, given the specification of the good permutations, generates the C++ code
performing the brute-force search. Please note that the same algorithm also
produced the table in this slide.
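To give a flavor of what such a generator does (this is a made-up toy, not the actual generator; Case, emitMatcher and the pattern encoding are invented), it simply walks the specification table and prints one C++ branch per case:

    #include <iostream>
    #include <string>
    #include <vector>

    // One entry of a toy specification: if the P-node exposes this pair of
    // color patterns, use this precomputed order of groups.
    struct Case { std::string patternA, patternB, solutionOrder; };

    // Emit a chain of `if` statements, one per case, as C++ source code.
    // The real generator covers all 180 cases; here we show only two.
    void emitMatcher(const std::vector<Case>& table) {
        std::cout << "const char* matchPNode(const std::string& a,\n"
                     "                       const std::string& b) {\n";
        for (const Case& c : table) {
            std::cout << "    if (a == \"" << c.patternA << "\" && b == \""
                      << c.patternB << "\") return \"" << c.solutionOrder
                      << "\";\n";
        }
        std::cout << "    return nullptr;  // negative instance\n}\n";
    }

    int main() {
        emitMatcher({{"BBRR", "RRBB", "g0 g2 g1"}, {"BBBB", "RRRR", "g1 g0"}});
    }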
And now the experimental part. This is the environment we used: basically a Linux
platform with standard C++.
In order to test performance we need a lot of positive instances. For this reason we
developed a graph generator, actually a solution generator, since it generates
book-embedded graphs.
The parameters of this generator are the number of nodes and the number of edges. We
used the generator to generate three sets of instances with different edge
densities; we did this to have different kinds of SPQR-trees. In
each set the graph size ranges from 10,000 to 100,000 nodes, and each value is
averaged over five graphs of equal size.
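One simple way to build such positive instances (a sketch under my own assumptions, not the speaker's generator, whose exact distribution and parameters are not described here) is to place the nodes on a spine and add random chords to one of two pages only when they cross nothing already on that page:

    #include <algorithm>
    #include <iostream>
    #include <random>
    #include <utility>
    #include <vector>

    using Edge = std::pair<int, int>;  // endpoints are spine positions, first < second

    // Two chords on the same page cross iff their endpoints interleave.
    bool cross(const Edge& e, const Edge& f) {
        return (e.first < f.first && f.first < e.second && e.second < f.second) ||
               (f.first < e.first && e.first < f.second && f.second < e.second);
    }

    bool fitsOnPage(const std::vector<Edge>& page, const Edge& e) {
        for (const Edge& f : page)
            if (cross(e, f)) return false;
        return true;
    }

    int main() {
        const int n = 20, m = 40;                // parameters: nodes and edges
        std::mt19937 rng(42);
        std::uniform_int_distribution<int> pick(0, n - 1);

        std::vector<Edge> red, blue;             // the two pages of the book
        int attempts = 0;
        while ((int)(red.size() + blue.size()) < m && attempts++ < 100 * m) {
            int a = pick(rng), b = pick(rng);
            if (a == b) continue;
            Edge e{std::min(a, b), std::max(a, b)};
            auto dup = [&](const std::vector<Edge>& p) {
                return std::find(p.begin(), p.end(), e) != p.end();
            };
            if (dup(red) || dup(blue)) continue;  // avoid multi-edges
            if (fitsOnPage(red, e))       red.push_back(e);
            else if (fitsOnPage(blue, e)) blue.push_back(e);
        }
        std::cout << "spine 0.." << n - 1 << ", red edges: " << red.size()
                  << ", blue edges: " << blue.size() << "\n";
    }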
Now, a few charts showing the statistics on our graphs. It can be seen that when the
edge density increases, the number of connected components decreases, which is quite
obvious. For example, in the last set we only have triconnected graphs, since they are
basically triangulations. The same thing can be seen from the SPQR-tree point of view,
with the last set having graphs with only R-nodes in their SPQR-trees.
These are our experimental [inaudible]: the application runs in linear time on our
data set.
To conclude, the result of our work is an implementation of an algorithm to tackle the
P2BE problem. We did it by modifying an original algorithm by Hong and
Nagamochi, with modifications aimed at increasing performance, [inaudible] the
part of the SPQR-tree algorithm that deals with P-nodes with a brute-force
approach that uses code generation.
What's more, in the future variations of P2BE should be investigated because of their
relationship with the other problems seen at the beginning.
Variations should be investigated in order to try to model more cases of SEFE. We have
seen that P2BE models the case when the intersection graph is a
star, and one could try to generalize this result by modeling the case when there is
more than one star, a double star, a caterpillar, or a tree. That's it.
Thank you for your time.
[applause].
>> Host: Any questions for Marco?
>>: [inaudible] actually don't understand why you need [inaudible] why you actually use --
>> Marco Di Bartolomeo: Although it's not augmented, the [inaudible] implementation of
SPQR-trees is not very fast. It's not efficient, it's not linear; I think it's something like
quadratic or cubic, because when it was developed it wasn't aimed at being efficient.
However, we found GDToolkit quite easy to use, and we were
very used to it, so it worked very well as a main framework for our application. We
mixed them together by translating the SPQR-trees built by OGDF to GDToolkit.
>>: My question is -- and this problem [inaudible] you kind of search for a [inaudible]
you search for embedding such that [inaudible].
>> Marco Di Bartolomeo: It's not for the dual graph. It's [inaudible].
>>: [inaudible].
>>: Have you also studied the problem where you don't [inaudible] the vertex at the end has
to be the same? Have you also studied the similar problem where we don't search for
this circle but for a path, because this could be related to double and [inaudible]
recently?
>> Marco Di Bartolomeo: Okay. Actually not, because we --
>>: Because in this case you can have -- you can allow for the separators, but only for
two of them, because the one separator is where you start this Eulerian path and mark the
separator which [inaudible].
>> Marco Di Bartolomeo: Yes, but [inaudible] models [inaudible] the only way. You
have to start at the beginning and end at the same point.
>>: [inaudible] you don't have to restrict [inaudible] and essentially -- because this
2-page embedding is also related to these two graphs where you have this [inaudible]
Hamiltonian circle and this planar graph, and [inaudible] Hamiltonian circle by
Hamiltonian path. And the same happens [inaudible] Eulerian circle [inaudible].
>> Marco Di Bartolomeo: I see. That's something [inaudible].
>> Host: Any other questions?
Okay. So let's thank you -- did you have a question?
>>: [inaudible] the problem becomes if you could [inaudible] constraints on the
[inaudible].
>> Marco Di Bartolomeo: Yeah. The original motivation for our work was the
relationship between the SEFE problem and this kind of book-embedding problem.
This is a result from Angelini et al., in 1210 I suppose, showing how the SEFE problem
with a connected intersection can be reduced to a SEFE problem with a tree
intersection. And this can be reduced to a book-embedding problem where the order of
the nodes along the spine is constrained by a tree.
And that's -- that's the reason for this last point. Yeah. It should be investigated.
>> Host: More questions? Okay. Let's thank Marco again for a nice presentation.
[applause]