>> Host: So thank you for coming to the second session. You have very nice full talks today, but I have one announcement from the organizers. If you present games, then please bring your laptop during the coffee break and also the lunch break, and demonstrate your game near the poster session so that people can try your game. So today and tomorrow, please bring your laptop and demonstrate your game. And the first talk will be given by Ulrik Brandes. Thank you. >> Ulrik Brandes: All right. Now you should hear me better. This is joint work with my PhD student Mirza Klimenta. Peter Eades calls it a myth that professors are important. This is based on the conjecture that there is an exponential decay in knowledge compared to the next generation of students that you have. Mirza is my PhD student, so he should be five times more knowledgeable than me, according to this conjecture. And maybe here is some evidence to support the conjecture. These are the last three PhD theses that I had to review; this is Mirza's, which he just turned in. [laughter]. It's the most comprehensive work on MDS for graph drawing that I have seen so far. And just to be sure, it's 11 point with small margins, right? [laughter]. So he should really be giving the talk, but it was too difficult for him to get here, so I will do it. It's about MDS, and let me briefly recall what the principle is. Given some pairwise dissimilarities between objects, we try to represent the objects in some geometric space, two- or three-dimensional, such that the distances in that space resemble the dissimilarities that were input into the method. It's a whole family of possible approaches, and the two main variants are illustrated here. In graph drawing, the typical use of this method is that the dissimilarities we input are the shortest-path distances in the graph, but other distances are possible. And of course the target space is two- or three-dimensional, because we would like to look at these drawings at some point. The main distinction between members of this family is between classical scaling and distance scaling, where classical scaling was the first method proposed. It is based on spectral decomposition, so it is basically a spectral method, whereas distance scaling is an iterative method that tries to optimize some objective. You can see there is a qualitative difference between the results, qualitative not in terms of quality but in the sense that the kinds of results we obtain are really distinct: local details are much better visible in this case, whereas large distances are represented very well in that case, with some degeneracies. This is the comparison between the two. As I said, one of them is based on spectral decomposition, the other on iterative minimization of a quadratic objective. For the spectral decomposition method, one of the very good things is that we have very fast approximation methods; for distance scaling this is not so much the case, even though there are speed-up heuristics. Because classical scaling is a spectral method, if the spectrum is not well behaved, if the large eigenvalues have higher multiplicity, there can be degeneracies. And you have seen some of those degeneracies as well, because we are only representing a subspace.
For distance scaling we have local minima; because it is an iterative method it depends strongly on the initialization, and you have seen examples in the previous session where this is the case. Because classical scaling is a spectral decomposition, it is also considered to be very inflexible: we set up a matrix, we look at the eigenvectors, and there is nothing we can do about the result anymore. With distance scaling, on the other hand, we can fiddle with the objective function, we can change weights, we can introduce other elements into the graph, and so on, so it is considered to be more flexible. In terms of the results, the overall qualitative difference is that we see the overall shape of the graph very well with classical scaling, because of its tendency to represent larger distances more accurately, and we see more of the local details of the graph with distance scaling. So the conclusion of this comparison is that in practice the typical way of using this is to run classical scaling as an initializer to unfold the graph, and then improve the local details by running a few iterations of distance scaling. As a simple way of memorizing this: C comes before D, right? [laughter]. What I am going to do in this paper, what I am trying to show, is that we propose two adaptations that address the inflexibility and the degeneracies of classical scaling. Because it can be approximated very fast, we can apply it to very large graphs; but if we look at very large graphs, we still want them to look decent. So we thought there are ways to modify classical scaling in such a way that we can control the output a little better than if we apply it in the standard way. Because it is all about classical scaling, here it is in a nutshell: what is the spectral decomposition method? Our input is some dissimilarities. Typically they would be metric; in graph drawing they are, because we are using shortest-path distances. The goal is to find coordinates, represented as an n-by-d matrix where the d-dimensional coordinates are the row vectors, such that the input dissimilarities, the shortest-path distances, are represented well by the Euclidean distances in the target space. All right? And note that distances are closely related to inner products, because we can express the square of a distance as a sum of inner products. So we can look at classical scaling as the reverse, as a reconstruction task: if we knew the coordinates, we could compute all the inner products, from the inner products we could compute the distances, and then we would have the matrix that was given as input. Now we are trying to do the reverse, without even knowing whether such coordinates exist. So we set up an inner-product matrix using this formula, which is derived from the other, easy-to-recognize formula in such a way that one of the degrees of freedom is resolved, because distances are translation invariant whereas coordinates are not. And the way this degree of freedom is resolved has an effect on how the matrix is set up; in this case, it is the centroid of the configuration that is placed at the origin of the result. All right. The details are not that important; it is just important to see that we can easily set up this matrix by doing some computations on the input dissimilarities. And then we apply a standard decomposition: this matrix is symmetric, so we can do a spectral decomposition on it and get the coordinates out of it. All right? So this is what we do. We get the matrix of eigenvectors of B, the diagonal matrix of eigenvalues, and again the eigenvectors, and then we split this in the middle by taking the square roots of the eigenvalues, keeping the d largest eigenvalues, where d is the dimension of the result. So technically this is not difficult. Two things to notice: it is known that this decomposition is the best rank-d approximation of the original matrix B, so what we are basically doing is implicitly optimizing an objective function that looks like this, where the error in the inner products is penalized. And because of that, we have a bias towards larger distances; this is where that comes from.
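As a concrete illustration of the procedure just described, a plain classical scaling routine might look like the following sketch (NumPy, dense eigendecomposition; this is only the textbook method, not the fast approximation mentioned above, which is what makes the technique usable for very large graphs):

```python
import numpy as np

def classical_scaling(D, dim=2):
    """Classical MDS: find coordinates whose Euclidean distances approximate
    the dissimilarities in D (for graph drawing, typically the matrix of
    shortest-path distances)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering: puts the centroid at the origin
    B = -0.5 * J @ (D ** 2) @ J               # double-centered inner-product matrix
    evals, evecs = np.linalg.eigh(B)          # B is symmetric, so eigh applies
    top = np.argsort(evals)[::-1][:dim]       # keep the `dim` largest eigenvalues
    lam = np.clip(evals[top], 0.0, None)      # guard against small negative eigenvalues
    return evecs[:, top] * np.sqrt(lam)       # X = V_d * sqrt(Lambda_d)
```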
This degree of freedom of choosing the centroid as the origin is very important for what we are doing, so let us look at it more closely. It has been noted many times in the MDS literature, but it has not been used for graph drawing so far; let me just characterize it a little. If the input coordinates have the same dimensionality that we are trying to reconstruct, so we are doing a full reconstruction, then resolving this degree of freedom in a different way, changing the origin, corresponds to a rigid transformation: we are just shifting the image around. If we do a low-dimensional reconstruction, or the input may not be metric at all, then this results in a distortion. All right? If we pick any particular vertex of the graph as the origin, we can set up the inner-product matrix in this way, which looks almost like the previous one, because the previous one used the average of all vertex coordinates and now it is only one particular vertex. In general, we can weight all of the vertices in this way, so any convex combination of the vertex positions can be chosen as the origin, and all that needs to be done is to set up the inner-product matrix in a slightly different way. We are not changing the complexity and so on, we are just choosing where we would like the origin to be. And the effect is seen here. This is the centroid of the configuration, the standard classical multidimensional scaling solution that you would get. If we pick, for example, this vertex, the result is slightly different, because we are reconstructing different inner products. And if we pick, for example, the graph center, so the average of the vertices in the center of the graph, the vertices with lowest eccentricity, then the result looks like that. All right. So we can control the shape a little bit by focusing on one particular area, and this is the first contribution: we suggest a method to do focus and context within the framework of classical scaling. All right? The approach is fairly simple. The goal of a focus-and-context technique is to enlarge some area, to have a more detailed view of something that is too large to be represented on the screen entirely; we focus on one area but would like to preserve enough of the context to orient the user in the graph. This is the main idea. What we do here is set the origin of the configuration to the focal point that we would like to address, and then we change the way the inner products are set up. We change them in such a way, as we have seen before, that the inner products are now expressed in terms of that particular origin.
We control the contribution of the length of the distance from that origin, and we control the angle between the two position vectors. What this does is that we can enlarge an area by choosing the exponent r appropriately, focusing on all the distances that are close to the focal area, and we can also spread the context around by changing the exponent for the angle. In comparison to previous methods, for example the typical fisheye views of graphs, this is an input-space method: we are changing the data that is input into the algorithm, rather than what is usually done in output-space methods, where the method has already introduced its distortion and then the drawing is distorted additionally. All right? One thing to notice about this particular fisheye method is that the context is fairly narrow, because we are enlarging some area in the [inaudible]. But by controlling q we can enlarge the context, or use the empty space around the context to show more of it in the same drawing. So it is one way to control it, and I think one of the good things about it is that the distortion happens only when the method is applied, so we do not have these uncontrolled interferences between two distortions introduced into the drawing. Because we have these two parameters, we can use them differently. For the same graph we can put the focus here, where the color indicates how much the edges are distorted, or we can put the focus here. This is just a change of origin, to show that we can actually enlarge different areas. If we now change the parameters a little bit, we see that in this case the focal region is enlarged and the context is slightly shrunk, just by changing the parameters in this way. And of course this graph has been chosen because things are easily observable on it. We can apply this also to bad layouts, and the same thing happens; but of course, if the layout is that bad in the beginning, the focus and context does not look that good on this graph either, so it is not a good demonstration. Still, it shows that putting the focus here, there, or there has the same effect on any kind of layout. Sasha. >>: Is there a reason why you chose r and q the same? >> Ulrik Brandes: No, that is just for the illustration here. They can be anywhere between 0 and 1. And here we see that when we choose them -- oh, no, we didn't choose them differently. >>: [inaudible]. >> Ulrik Brandes: Oh, yeah, here it is different. They should be independent controls, but maybe I didn't demonstrate that enough. All right. So that is one thing. Because classical scaling can be approximated very fast, this is an interactive method. I claim that the layout usually takes less time than rendering the graph, so we can do this in an interactive setting where we change the two parameters, or change the focus, to our liking. The addition to this first method is that maybe we have multiple foci that we would like to look at. Using the same type of projection that we used for picking one of the origins, we can also pick different focal regions and then combine the results; we do a weighted combination of all of the inner products. Then we have to pick a common origin, a common focus, for all of them, which will affect the results. What you can see is one focus here, one focus there, and if we combine them, we get both enlarged areas as well. All right? But here it is a matter of where we choose the common origin what the actual result will be, because of the translation variance of the inner products. All right.
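The change of origin underlying these focal variants is small enough to show in a sketch. The version below picks a single vertex k as the origin and rebuilds the inner products from the law of cosines; the distance and angle exponents r and q from the slides are deliberately left out, so this only illustrates the origin shift, not the full focus-and-context transformation:

```python
import numpy as np

def inner_products_with_origin(D, k):
    """Inner-product matrix for classical scaling with vertex k as the origin:
    B[i, j] = <x_i - x_k, x_j - x_k> = 0.5 * (d_ik^2 + d_jk^2 - d_ij^2).
    Feeding this B into the same eigendecomposition as in the earlier sketch
    reconstructs a layout 'focused' on vertex k instead of on the centroid."""
    D2 = D.astype(float) ** 2
    return 0.5 * (D2[:, [k]] + D2[[k], :] - D2)
```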
Now, the second modification is, I think, the more important one; the one that is coming now matters more for the quality of the results in practice. If we consider classical scaling as a reconstruction task, then there are actual coordinates, and we can rewrite the way the coordinates are produced into something that resembles a projection. It is actually known as principal component analysis, but let me summarize what it does. We have the original coordinates, and these coordinates are transformed by equalizing the spread along the dimensions and orthogonalizing them; never mind the details, these are the eigenvectors of the inner-product matrix of the coordinates. Using this we get a projection that has this effect, and it can be shown that this is a maximization of the distances between points. So we are spreading out the drawing in all directions as much as we can. This is what would happen if we knew the coordinates. Now, what we are going to change is this projection matrix. The deficit of classical scaling really is that, with these high-dimensional coordinates, maximizing this term greedily selects the dimensions: the most important thing in these data transformation operations is to maximize the variance along the dimensions. But for the drawing this may be very bad. Here, the first dimension that is picked is the longest direction through the graph, and the second dimension is basically the second-longest axis that can be found in this drawing. So this is the result: even though a three-dimensional configuration defines the input, it is not very well visible in this type of projection. All right? What we would really like to see is most of the graph. How can we do this? Well, we change the objective to a weighted version of the original objective by modifying the projection matrix. Instead of the eigenvectors of the inner products of the coordinates, we use a weighted version of this, where this is the Laplacian of some weighting scheme that I am not going to detail. This has been observed and explained in detail by Koren and Carmel; we are just using it here to improve the projection that we get for the original coordinates. And this is the result if we simply take the adjacency matrix as the weighting scheme, compute the Laplacian from it, and enter that into the eigenvector computation: this is what we actually see of this three-dimensional configuration. Because we are now maximizing the lengths of the edges in the graph; the weights filter out all of those pairs that are not connected, so we are not maximizing all distances but only the distances between vertices that are adjacent. This is why we see more of the edges in this projection.
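A minimal sketch of that weighted projection step, assuming high-dimensional coordinates X and a symmetric weight matrix W are already given; with the adjacency matrix as W it stretches edges rather than all pairwise distances, and with uniform all-pairs weights on centered coordinates it reduces to ordinary PCA:

```python
import numpy as np

def weighted_projection(X, W, dim=2):
    """Project n x k coordinates X down to `dim` dimensions, maximizing
    sum_ij W[i, j] * ||y_i - y_j||^2 over orthogonal projections."""
    L = np.diag(W.sum(axis=1)) - W                 # Laplacian of the weighting scheme
    M = X.T @ L @ X                                # only k x k, cheap even for large n
    evals, evecs = np.linalg.eigh(M)
    A = evecs[:, np.argsort(evals)[::-1][:dim]]    # top `dim` projection directions
    return X @ A
```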
Now, in our case, coordinates are not known in the beginning, and the more dimensions we use in classical scaling, the better the representation is. So our new technique is to start from the dissimilarity matrix and do classical scaling, but in higher dimensions than we actually have as a target, to have more of the details represented in this higher-dimensional space. But because most of us are not capable of looking at 10-dimensional drawings, we then apply weighted PCA to get a good projection of this drawing. So we choose some number of dimensions, for example based on where the eigengap is, based on where most of the data is already represented, and then we do this type of projection. For example, in the same way that I showed before, but to be sure that we are not looking from a very bad direction with lots of occlusions, we add a few random dyads to stabilize the projection with respect to what we would normally get; just a few of those. And it can be shown, and that is in the paper, that this is again related to the choice of origin; the choice of origin can be expressed in this framework as well. This is the final slide, showing just one example. This is a typical 2D classical scaling of one of the Matrix Market examples. This is what we get when we do this in 6 dimensions to represent more of the details in that higher-dimensional drawing, and once we have the coordinates we choose a perspective that shows most of the graph. And even though this may look very fuzzy, it actually shows, in this detailed view, all the differences between those nodes that would not be visible in the two-dimensional projection, because they are only represented in the higher dimensions. This is a graph that has half a million edges. Using the basically linear-time approximation algorithm for classical scaling, this is done in seconds on a regular laptop computer. All right? That's all I had to say. Thank you. [applause]. >> Host: Eric. >>: [inaudible] interactivity saying [inaudible] can you make a smooth transition from one [inaudible] to another [inaudible] by adjusting the weight? Is it possible? It looked like it's possible, but I -- >> Ulrik Brandes: I haven't thought about it, but I think so. I think it's a good suggestion that we change the weighting scheme from the unit vector that focuses on one vertex to the unit vector that focuses on the other by just interpolating between them. >>: It seems by just adjusting the two parameters, yeah. >> Ulrik Brandes: Not the two parameters, but the choice of origin. So if we move the origin from one node to the other -- >>: [inaudible] geometrically but -- >> Ulrik Brandes: In input space. >> Host: Okay. Ben. >>: That was a terrific presentation. I appreciate it. I have two concerns. One is the fisheye representation. In the example you showed, the zoom factor is only about two or three, and most studies show that the spatial instability is damaging and that the advantage only comes when you get five or more. So have you got examples where the focus-and-context has a zoom factor of at least five and that really shows the value of the tool? >> Ulrik Brandes: That's a very good point. We basically wanted to demonstrate that this method is more flexible than it seems to be. But of course that -- >>: Magnification approaches or distortion-based approaches are undermined by the lack of spatial stability. So that was a concern. The other is about the multidimensional scaling: I'm still not convinced that it provides an effective representation, because the parts of the graph where the representation is weak are not emphasized.
So we don't know where to trust it and where to doubt it. Is there a way you see of solving that problem to make CS a better payoff? >> Ulrik Brandes: I think that CS is really good at representing, say, finite element meshes, because they have some geometric origin to them. If it's a graph like the one in the beginning, we would still like to use this as the initialization; but if the initialization is better, then the distance scaling will work better as well. The representation error is something that we thought about representing graphically, for example by coloring or something. >>: [inaudible]. >> Ulrik Brandes: We don't know -- >>: Let me try another idea, which maybe you have thought of or have a solution for, which is: can you characterize a family of graphs for which MDS provides effective representations? >> Ulrik Brandes: Yeah. That depends on the spectrum. >> Host: Okay. So David? >>: Okay. Thank you. >>: So this relates to something you said at the start of the talk and then again just now, that we should be doing CS and then DS. When you try it with this method, does the DS undo some of the differences that your weighting makes, or is there a way to make your choice of weights for the CS part carry over into the DS and see how it goes? >> Ulrik Brandes: I'm not sure whether it's easy to control what happens. I would think that we could limit the displacement that DS is doing in the end, and maybe we could actually change the weights in the DS, but that's something that needs to be explored. >> Host: Steven. >>: Why six? >> Ulrik Brandes: Oh, because that's where the eigengap was largest, so any additional dimension would not have added much more detail. >>: So in general, you can look at the eigengap and choose to go to 20 or 7 or -- >> Ulrik Brandes: Yes. >> Host: Okay. Thanks for giving a nice presentation. [applause]. >> Host: And the second talk will be given by Lowell Trott. The title of the talk is Force-Directed Graph Drawing Using Social Gravity and Scaling. >> Lowell Trott: So this was work done with Michael Bannister, David Eppstein and Michael Goodrich. Just a brief outline: we're going to cover a little bit of background, a briefing on force-directed methods, since we've been talking about those a little earlier, what we mean by social gravity, and what we mean by scaling. Then we'll look at some examples of trees and drawings of social networks, with a brief foray into some things that concern Lombardi drawings. So quickly, what are force-directed methods, which we'll also refer to as spring embedders? They're an algorithmic method for drawing graphs that is fairly flexible, and a nice feature is that they require no domain knowledge about the graphs being input. The model we're basing our algorithm on uses primarily two simple forces: a repulsive force between all nodes and attractive forces between adjacent nodes. Some of the challenges that these methods face: modeling a physical system tends to have local minima that you can get trapped in, so you don't produce the optimal drawing that you would prefer, and modifying the simple forces by adding additional forces can make it even harder to get out of these local minima. So what are we going to do? Let's add more forces and see what happens. Social gravity is the force we're going to be adding to the simple set.
The way this breaks down is that the force acting on a particular node is made up of a gravitational constant, which we'll talk about in a second, the mass of the node, which we'll actually use social metrics to define, and that node's position relative to the center of mass of the graph. So what do we mean by mass in terms of a social metric? Well, we're going to use something called centrality to define which nodes are important. Centrality is a metric used in social network analysis to give some understanding of nodes, or rather of the entities they represent in a social sphere, and their importance based on their location in the graph structure. There are a few different definitions we'll get into here that we'll be referring to. One is degree centrality. Pretty straightforward: a node's degree centrality scales with its degree, the idea being that if a node has higher degree, then in a social context it's more likely to get infected by an infection traveling through the network, just because it's more connected. Closeness centrality is just a measure of how close a node is to all other nodes in the network, so it's the inverse of the mean distance from a node to all other nodes. And as you can see in the drawing to the right, as a node is more central, you'll see the closeness centrality rise in our heat map; it goes from cooler colors to brighter reds. That will be important later. And then finally betweenness centrality, which is a summation of the fraction of shortest paths going through a particular node. The idea is that if you're a single node, you consider all shortest paths between pairs of other nodes in your graph; some number of shortest paths between those nodes will pass through you and some will not. Your fractional importance is the number of those that do pass through you over the total number of shortest paths, and you sum that over all possible pairs in the graph. And as you can see in the example to the right, the centrality of the top two nodes is lighter than the centrality of the bottom center node, because the possible shortest paths between pairs from the left to the right are split between those nodes, whereas in the bottom example they're not. All right. So what do we mean by scaling? The idea is that we couldn't, from the onset of our algorithm, just input the gravitational force as a constant value and run the algorithm. If we look at a simple forest here, laid out with a classic inverse-square spring embedder, you'll see that it has kind of separated a little bit. Now, if we add our gravitational force, and this is where I need audience imagination, because these slides are PDF now and don't actually animate anymore, that graph over to the right that is obscured will slide over to the center. You can see the subcomponents of the forest have been pulled together using gravitational forces. Now, this was done with a gravitational force that was consistently high throughout the layout, and you'll see that it's a pretty decent drawing: we don't have any crossings, and the subcomponents are closer together, maybe easier to view. In the next drawing we'll see the same thing where the gravitational forces were scaled up: gravity started at a very low value and increased over time until we achieved a drawing that we thought was acceptable. If we compare the two, you'll see that there are not many differences here, so this is actually kind of a counterexample to what I'm about to tell you; it works pretty well in the small case, and how the gravity is increased seems not to affect the result. But if we look at this example, a little bit of a larger forest, we'll see that there are actually a lot of crossings: one, two, maybe 16. This forest, because it's a little bigger and a little more complicated, wasn't able to cleanly break apart before we began the gravitational layout. In the next case you can see the same forest where gravity started out low and slowly increased until the final drawing was achieved, and here we get a much cleaner visualization. There is actually one crossing in there; I'm not going to tell you where it is, you have to find it for yourself. So how do we do our scaling? Well, we have this very complicated diagram here. Not really. What we found in our algorithmic study is that how we tweaked our scaling wasn't as important as the fact that we did increase from a lower value to a higher value. So we used this simple step function: a few iterations, increase gravity, a few iterations, increase gravity, and so on until we achieve something we thought was a nice drawing. What we found is that varying the number of iterations before a gravitational increase, or the size of that increase, didn't actually change our results too much. Obviously there's some work to do here to see the effect of maybe a more complicated function on the final layout.
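Putting the pieces together, a toy version of this force model with a stepped gravity schedule could look like the sketch below; the constants, exponents, and schedule are placeholders rather than the speakers' exact choices, and any centrality vector (degree, closeness, or betweenness) can serve as the mass:

```python
import numpy as np

def social_gravity_layout(pos, edges, mass, iters=300, step=0.01,
                          g_max=1.0, levels=10):
    """Spring-embedder iterations with an extra 'social gravity' force.

    pos   : (n, 2) array of initial positions (e.g., random)
    edges : list of (i, j) index pairs
    mass  : length-n per-node masses, e.g. a centrality score
    Gravity is ramped up in `levels` steps over the run instead of being
    applied at full strength from the start.
    """
    pos = np.asarray(pos, dtype=float).copy()
    mass = np.asarray(mass, dtype=float)
    n = len(pos)
    for t in range(iters):
        g = g_max * (t * levels // iters + 1) / levels        # stepped gravity schedule
        force = np.zeros_like(pos)
        for i in range(n):                                    # all-pairs repulsion
            d = pos[i] - pos
            dist = np.sqrt((d ** 2).sum(axis=1)).clip(min=1e-3)
            force[i] += (d / dist[:, None] ** 3).sum(axis=0)  # ~ 1/dist^2 magnitude
        for i, j in edges:                                    # spring attraction along edges
            d = pos[j] - pos[i]
            force[i] += d
            force[j] -= d
        centroid = pos.mean(axis=0)                           # social gravity: mass-weighted
        force += g * mass[:, None] * (centroid - pos)         # pull toward the center of mass
        pos += step * force
    return pos
```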
And so now we'll look at some examples in terms of trees. Here's a tree drawn with closeness centrality coloring; the red nodes, as you might imagine, have a higher closeness centrality value. And when you see a drawing [inaudible] this tree kind of spreads out, and some of the connections, the edges, might be harder to determine cleanly; you get kind of a lower angular resolution out at the extremities. But when we draw this with a little bit of gravity, what we see is that our total area coverage per node is decreased, so we get a higher node density and a better distribution, or, excuse me, a more uniform distribution of nodes. Now, you do sacrifice some of the, I guess, immediate contextual notion of where nodes are in the tree. But if you pay close attention, you can see that there are no crossings, separate tree paths still remain fairly separate, and our higher-centrality nodes, because of that gravitational constant, are pulled towards the center. Here we have a convenient coloring, but if you were labeling these nodes with something else, not centrality, the nodes that you might prefer to learn about in a social context, those with high centrality based on your choice of metric, are in the center of the graph, so they would probably be viewed more easily by someone trying to examine the data. So here's another tree. We can look at this tree drawn as well, and there would be some nice animations here, but imagine trees sliding in. This is the same tree drawn with a few different types of centrality: up on the left here is degree centrality, this is closeness, and on the far right is betweenness centrality. This is just to illustrate that although the centralities are somewhat related, they do differ. But the structure is generally the same for these simple trees.
Because the centralities share some notion, the structure of the final drawing tends to be similar. Now, here's an example of a much larger forest with a few interesting pieces. The larger trees of the forest are drawn towards the center because they contain nodes with higher gravitational masses, and you still get, despite the number of nodes and connections, a crossing-free drawing. Moving on to actual social networks, here are just a few examples. On the left you'll see a social network that kind of looks like a traditional drawing; maybe the nodes that have long edge connections out are further away from the center than you might hope based on their centrality value. On the right, as you can see, those nodes are drawn in towards the center, and there's a definite increase in angular resolution at the extremities. Here's another tree drawn with closeness centrality, and finally one with betweenness centrality. As you can see, the increase in uniform node distribution really helps in distinguishing which nodes are connected: you kind of lose that in this region, but you're able to understand it a little better on the right. So not to belabor it, but we've got a few more real-world social networks. This is an infection graph with closeness centrality, and one more with betweenness centrality coloring. So even as the nodes get more complicated, or rather the graphs get more complicated, we still gain angular resolution and a little bit of understanding of local node placement. So, a brief foray into Lombardi drawings. Hopefully most of you who were here last year heard from Olga who Mark Lombardi is. He's an artist who drew conspiracy networks by hand, with long arc edges. On the left here we have an example of his work, and on the right a straight-line approximation of that drawing. And then, if we slide this over, we can see that same graph drawn with our algorithm. Now, while we don't maintain the same node placement that Lombardi used when he hand-constructed this, we do gain some movement of the high-centrality nodes towards the center. These, I don't remember if I said, are conspiracy networks, so you might actually want the nodes more central to the conspiracy to be located at the center of your network. And then here's another example, and on the right that drawing with our algorithm. And because this is a force-directed method, you can actually use one of the results from last year, force-directed Lombardi drawing, to augment the forces used and add curvature to the edges. So on the right you'll see some long curved paths that you would hope to get from those results. Our gravitational force actually works seemingly well with the Lombardi force. So now let's take a brief look at our algorithm in action. If I can full-screen this. So here you see, oh, sorry, a tree being laid out. As the gravitational force increases you'll see those extremities spread and be drawn towards the center, and the nodes reach a more uniform distribution. And obviously if gravity gets too high, you may have caught it there, you get one crossing, so there is some limit to how high you can take gravity before your drawing starts to degrade. Here's a small social network. The pulsing that you see happening is actually the scale of the drawing increasing: as the gravity pulls the nodes together and the bounding box shrinks, we're blowing the drawing up.
So just to maintain visibility. But when you think of relative position, you still get the benefits we're looking for. Here in the forest you'll see that the standard forces actually repel the subtrees. This is where the scaling becomes really helpful: it allows some of those crossings that would otherwise get caught up to be cleanly eliminated, and then the gravity begins to draw the subtrees back in until you get the drawing you're going for. All right. Great. That's all I have. Thank you very much. [applause]. >> Host: [inaudible]. >>: Do you have a way to control whether the subtrees of the forest overlap in the end? Because it looks like they were repelling each other. >> Lowell Trott: Yeah. So they're repelling each other with the standard repulsive forces. One thing that we didn't get into too much is balancing those repulsive forces with the gravitational forces. We kept the gravitational forces fairly light in comparison, because we thought that the traditional methods seem to construct really nice graphs on a regular basis, so we didn't want to overshadow that with a really intense gravitational force. But, yeah, if you want that kind of intermingling you can reduce that initial scaling period and have a stronger gravitational force at the beginning. >> Host: Other questions? >>: Yes. So [inaudible] showing circular outlines in the end; since most computer screens or pieces of paper are rectangular, is there a way to define gravity so that you would get something rectangular? >> Lowell Trott: We thought about that. I don't actually remember how the -- yeah, maybe it's -- >>: [inaudible] actually tried an L1 metric or something like that instead of the normal metric, in the hopes that it would work; it just produced chaos. [laughter]. But I think it would be very simple, and I didn't try this one yet, to simply get oval shapes, so that you could maybe fit a wide-screen monitor better, by scaling one of the components. That should work. I don't see [inaudible] to get squares [inaudible]. >>: So [inaudible] overall structure of the graph. Have you done any studies to see if these outweigh the costs of [inaudible]? >> Lowell Trott: No, we haven't. But I would love to see the results of that. >> Host: More questions? Okay. Let's thank Lowell again for a nice presentation. [applause]. >> Host: And the title of the third talk is Progress on Partial Edge Drawings. The talk will be given by Till. >> Till Bruckdorfer: Okay. So I'm going to talk about progress on partial edge drawings. It's joint work with Sabine Cornelsen, Carsten Gutwenger, Michael Kaufmann, Fabrizio Montecchiani, Martin Nollenburg, and Alexander Wolff. First of all, what is a partial edge drawing? Consider an arbitrary graph with clutter caused by many edge crossings. One of the main questions in graph drawing is how to reduce the number of crossings. The traditional way is to simply change the embedding. But what we do is break the edges, so every edge becomes two pieces, which we call stubs, and then we shrink the size of the stubs. If we remove the middle half of every edge, we get a layout with only 23 crossings, compared to over 400 crossings before. We can even reduce the number of crossings to zero; this is a layout showing that. The motivation for this is a talk from last year given by Burch et al.
They evaluated partially drawn links for directed graph edges, considering two layouts for edges, tapered and traditional drawings, and considering variants of the length of the stubs. They wanted to know whether such a drawing still makes it possible to understand the graph. For example, one question is: is there a link connecting the red and the green node? The users in the study had to click buttons, yes or no, and so on. The result was that for this question, when we shrink the size of the stubs, the error rate increases. But if we consider another question, like pointing out the node with the highest degree, then we can see that with the lowest stub size the error rate decreases. This is very interesting, because at a stub length of 75 percent there is a dip, and this motivated us to introduce a formal definition of partial edge drawings. So, a partial edge drawing is defined as a drawing where every edge is replaced by two stubs, incident to its start and end vertex, and the stubs are not allowed to cross. If you now ask which graphs admit such a drawing, the answer is any graph, because you can always draw the stubs sufficiently small. To make the question more challenging, we introduce further properties. Consider the complete graph on 6 vertices. We can introduce symmetry, which means both stubs of every edge have the same length, and we can introduce homogeneity, which means the ratio of stub length to edge length is the same over all edges; in this case, we removed a quarter of every edge. And of course we can combine the properties to get a symmetric homogeneous partial edge drawing, a SHPED. This is an interesting kind of drawing, because if the stub ratio is known, you can guess from the length of a stub and the ratio where the corresponding end vertex is. So we consider those drawings with a given ratio; in this case we have a ratio of one quarter, which means one stub is a quarter of the total edge length. We also consider symmetric partial edge drawings, SPEDs, where we maximize the total length of the stubs.
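To make the stub construction concrete, here is a small sketch with hypothetical helper names: given straight-line vertex positions, it builds the two stubs of every edge for a ratio delta and checks the defining condition that no two stubs cross:

```python
from itertools import combinations

def stubs(p, q, delta=0.25):
    """The two stubs of edge (p, q) in a symmetric homogeneous partial edge
    drawing with stub-to-edge-length ratio delta; for delta = 1/4 the middle
    half of the edge is left out."""
    def along(a, b, t):
        return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
    return (p, along(p, q, delta)), (q, along(q, p, delta))

def properly_cross(s, t):
    """True if two segments cross in their interiors (shared endpoints do not count)."""
    def orient(a, b, c):
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    (p1, p2), (p3, p4) = s, t
    return (orient(p3, p4, p1) * orient(p3, p4, p2) < 0 and
            orient(p1, p2, p3) * orient(p1, p2, p4) < 0)

def is_delta_shped(points, edges, delta=0.25):
    """True if drawing every edge (given as index pairs into `points`) with
    ratio-delta stubs produces no stub-stub crossings."""
    segs = [s for i, j in edges for s in stubs(points[i], points[j], delta)]
    return not any(properly_cross(s, t) for s, t in combinations(segs, 2))
```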
So here is what I am going to talk about. First, we show that a 1/4-SHPED does not exist for complete graphs with more than 212 vertices. Second, I show the existence of delta-SHPEDs for some classes of graphs. And finally, we consider the minimum SPED, which has a 2-approximation; the minimum SPED corresponds to the maximum SPED if you minimize the gaps instead of maximizing the stubs. Okay. First of all, we require no embedding, and we fix the ratio to one quarter to keep things simple. There is a result from previous work that the complete graph on at most 16 vertices admits a 1/4-SHPED, and a drawing of it is what you see on the left side. It seems to be impossible to add a further vertex to this drawing without any crossings, but what we can actually prove is only that for more than 212 vertices it is not possible to draw a 1/4-SHPED. Our first idea was that we considered point sets with 30 to 50 points, and [inaudible] says that then we have 17 points on the half side of a convex curve, and for those points we can easily show, with the following techniques, that such a drawing does not exist; but that number of points is very large. So we reduced the number by taking a closer look at the point set. If you consider an arbitrary point set, you choose two points with maximum distance and rotate the points so that their connecting line is horizontal, [inaudible] a rectangle enclosing all the points. Now we consider just the top part of this rectangle. Here we have three points, a left, a right, and a top point, each having stubs, and they define a partition of this rectangle with the property that every point in such a cell has three stubs crossing the cell boundary. Now we compute, for every one of these 26 cells, how many points fit into it. As an example, consider the red cell: there is at most one point with the property that it has three stubs crossing three different segments of the boundary, and a further point has two stubs crossing one segment of the boundary. From the slope of the top stub of the lower point and the length of the left stub of the upper point, we can compute that only two points can possibly lie in this cell. We do this for all the cells and get a total of 107 points in the top part of the rectangle. Now, if we double this number and take into account that the left and the right point are counted twice, we get the number of 212 points. Okay. Now we come to the existence of a delta-SHPED for some classes of graphs. The key concept is that we enclose every vertex and its stubs in a circle; if we know that the circles do not intersect, we know the stubs do not intersect, although there might be drawings where circles intersect but the stubs do not. If we consider some vertices where two connected vertices have a certain distance, we see that one unit of distance covers two times the radius of such a circle. So we can compute from the distance the stub ratio such that each stub is contained in its circle, and in this case, for a distance D, we get 1/(2D) as the ratio. If we now consider graphs of bandwidth K, that is, graphs where we can order the vertices along a line such that the endpoints of every edge are at most K apart in this order, then we are sure that there is a 1/(2 sqrt(2K))-SHPED: we arrange the vertices in a snake-like fashion and compute the maximum distance between connected vertices, which is sqrt(2K) in this case, and by our earlier consideration with the key concept this sqrt(2K) shows up in the ratio. For K-circulant graphs we can make a similar argument. K-circulant graphs are graphs where we can order the vertices on the boundary of a circle and every vertex is connected to at most K vertices to its left or right. We take the same snake-like arrangement and draw it on an annulus; if you compute the distance, we again get a value, 4 sqrt(K), and replacing it in the key-concept formula we can prove this bound. Now, we can change the concept: instead of circles we can consider rectangles and ensure that they do not overlap. For bipartite graphs we consider just one set of vertices, which we place side by side, and we can compute from the ratio the number of vertices that fit in a row. We can also compute the number of vertices if we place the vertices of one set on top of each other, in such a way that if we have placed one vertex, the next one is upper-bounded by the value 1 minus delta, and we can iterate this down to the lower bound delta. So if we see the value 1/delta in the formulas, we know there is a side-by-side placement of vertices, and if there is a value with a logarithm, we know there is a top-on-top placement of vertices. If we now consider the complete bipartite graph on 2K plus N vertices, we see there is a bound for K with a logarithm, so there is a top-on-top placement in the construction.
As you see at the top and the bottom of this drawing; the remaining N vertices are placed on a horizontal line. In this case, if we have a ratio of one quarter, then this K is bounded by 4.8. We can also consider the complete bipartite graph on 2N vertices, and there both placements appear: we have side-by-side placement within a column and top-on-top placement of the columns in the horizontal direction, and both on each side of this drawing. [inaudible] to have no overlapping edges, we move some of the vertices slightly. Now we come to the minimum SPED, which has a 2-approximation. First we change the setting in such a way that we now require an embedding, so that we have fixed vertex positions; such graphs are called geometrically embedded graphs. For maximum SPEDs of geometrically embedded graphs, the optimum can be computed by dynamic programming in N log N time if the graph is 2-planar, and it is NP-hard to compute in general. So we consider now the minimum SPED, and here we get a 2-approximation in time on the order of N to the fourth, where the approximation is quadratic in the number of crossings and the number of crossings is quadratic in the number of edges. As I said before, in the minimum SPED we minimize the gaps of a drawing instead of maximizing the stubs, and we show a transformation from minimum SPED to minimum-weight 2-SAT; what we already know is that this problem has a 2-approximation. Okay. What we do here is consider an arbitrary edge and the edges crossing it, denoted F1 to F3, and we order the crossing edges F_i according to the distance to the closest endpoint of the edge. The pairs of segments E_i are now the parts from, for example, this vertex to F2, if we call that one part of the segment E1; so E_i in general is defined by the (i+1)-st crossing edge. Those segments give the variables in the instance of our problem, and we say that a pair of segments is not drawn if its variable in the instance is true. Then we have to introduce the implication that if we have drawn segment E_{i+1}, the segment E_i must be drawn too. And, of course, for every crossing we introduce another clause so that not both of the crossing segments are drawn. For every variable we introduce a weight w(E_i), and we now get an instance of the problem of minimizing the weighted sum of the variables over all valid variable assignments. This is quite interesting, but what does it have to do with the maxSPED? For exact solutions, minSPED is equivalent to maxSPED. But to get a true approximation for the maxSPED, we would need a transformation to the corresponding maximization problem; this would also solve the maximum independent set problem, and that has no 2-approximation as far as I know, so it is unlikely that the maxSPED has a 2-approximation. So let me conclude with the results. We have proven that there is no 1/4-SHPED for complete graphs with more than 212 vertices, we found some classes of graphs admitting a delta-SHPED, computing the maximum SPED is NP-hard in general, and the minimum SPED can be 2-approximated. For the future, we want to find more classes of graphs admitting a delta-SHPED, and we want to close the gap between 16 and 212 vertices. As we saw in the picture on one of the first slides, it might be possible to push the bound to 17, but it seems to be very, very hard. And of course we want to generalize this concept from a ratio of one quarter to a general ratio. Okay. That's it. [applause].
>> Host: Any questions for Till? So let's thank Till again. Thank you. [applause]. >> Host: The title of the last talk of this session is Implementing a Partitioned 2-page Book Embedding Testing Algorithm, and Marco will give the talk. >> Marco Di Bartolomeo: Okay. Good morning. My name is Marco Di Bartolomeo, I'm from Roma Tre University, and this presentation is joint work with Patrizio Angelini and Giuseppe Di Battista. A 2-page book embedding is a type of graph embedding where the nodes are aligned along a line called the spine, and the edges are assigned to two half-planes sharing the spine, called pages, in such a way that within each page the edges do not cross. The partitioned 2-page book embedding problem (P2BE) for a graph consists of finding a 2-page book embedding for it, where the assignment of the edges to the pages is part of the input. This problem lies between two other very important graph drawing problems, cluster planarity and simultaneous embedding with fixed edges (SEFE), which are very hot topics currently. [inaudible] can be used to solve cluster planarity when there are only two flat clusters, and it can be used to solve SEFE when the intersection graph is a star. We find it interesting and promising how the book embedding gives a different point of view on these two problems, which are apparently very far from it. Hong and Nagamochi solved P2BE with a linear-time algorithm. They characterize the problem as the problem of finding a particular planar embedding of the input graph; from such an embedding it is possible to build an auxiliary graph, and finally a special Eulerian tour of this Eulerian graph gives an ordering for the nodes along the spine of the book embedding. Now I will give you a very fast overview of the algorithm, so don't worry if you don't get all the details now. Let's say you have a graph; the algorithm starts by searching for an embedding for it, when one is found the auxiliary graph is built from it, and finally the Eulerian tour gives the ordering for the nodes. The Eulerian tour must be non-self-intersecting, meaning that it is possible to draw it without crossing the already drawn line. Our contribution to this work is an implementation of the algorithm. We performed some modifications to the original algorithm in order to make it more efficient. Please note that this is not about asymptotic time complexity, since the algorithm was already linear; we instead lowered the constant factors of the running time. Among the modifications we have made, there is the part of the SPQR-tree algorithm that finds an embedding, specifically the part that deals with P-nodes, which now uses a brute-force approach and code generation. Finally, we add details to the proof of correctness for the part of the algorithm that searches for the Eulerian tour. Now we'll discuss the details of the algorithm; this part is common between our work and the original work. [inaudible] a planar embedding that is said to be disjunctive and splitter-free. Recalling that the edges are partitioned into two groups, let's call them red and blue, we say that a node in an embedding is disjunctive if, around it, the incident edges of the same color are consecutive. A splitter is a cycle whose edges all have the same color, such that in both of the regions of the plane that the cycle identifies there is either a node or an edge of the other color. For example, the first cycle in the picture is not a splitter; you can see it is empty on the outside.
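Staying with the basic definition for a moment: with a fixed spine order and a fixed page assignment, checking whether the result is a valid 2-page book embedding is straightforward, since two edges on the same page cross exactly when their endpoints interleave along the spine. A minimal sketch with hypothetical names, not part of the authors' implementation:

```python
from itertools import combinations

def is_partitioned_two_page_embedding(spine, page_of):
    """spine   : list of nodes in spine order
       page_of : dict mapping each edge (u, v) to page 0 or 1
                 (in P2BE this partition is part of the input)
       Returns True if no two edges assigned to the same page cross."""
    rank = {v: i for i, v in enumerate(spine)}
    for (e, pe), (f, pf) in combinations(page_of.items(), 2):
        if pe != pf:
            continue
        a, b = sorted(rank[v] for v in e)
        c, d = sorted(rank[v] for v in f)
        if a < c < b < d or c < a < d < b:   # endpoints interleave, so the edges cross
            return False
    return True
```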
A disjunctive and splitter-free embedding must be found for every biconnected component, and this is done by a dynamic programming algorithm based on SPQR-trees, which collects information from the children of each node of the SPQR-tree. This information is about the presence of colored paths between the poles of each child. However, only those paths that are part of some colored cycle are of interest, since they could form a splitter. Please note that we also have information on the parent in this picture; later I will show how this is possible. Without this information it is possible to make bad decisions: in the figure there is a virtual edge with the wrong flip, and this produces a splitter which otherwise would have been avoided. You can see that the colored cycle separates two nodes of the graph. Once a disjunctive and splitter-free embedding has been found, the information is combined and set for the current node. Similar work is done for disjunctiveness. The [inaudible] information collected from the children is about the colors of the edges incident to the two poles and the ordering of such edges. For example, this edge has only blue edges at the poles, while this one has all blue followed by all red. We call such an ordering a color pattern, and a color pattern is collected for each child. We don't have color patterns for the parent at this point, but we know which colors are incident to the poles. Again, without the information bad decisions can be made: in the picture there is a virtual edge that has the wrong flip, making a node not disjunctive; you can see here that it has more than two color changes in the clockwise ordering of its incident edges. Finally, the information is combined and set as the color patterns of the current node for its parent. Now we will discuss the modifications we have made to the algorithm in order to make it more efficient. As I said before, we have information on the parent as well, and this information is about the presence of colored paths in the parent and the colors incident to the poles. This information can be collected by preprocessing the SPQR-tree, that is, a bottom-up traversal of the SPQR-tree followed by a top-down traversal, which together collect the information. Another important point is that the constraints imposed by disjunctiveness and by splitters must be considered together, since they could be in conflict. For example, in this picture there are two children whose flips are constrained by disjunctiveness due to these red edges, but unfortunately this produces a splitter which is not avoidable, meaning that the instance is negative; that is, there is no book embedding for it. [inaudible] we need some information for both the parent and the children: we need to know about the presence of colored paths and the colors of the edges incident to the poles, that is, the color patterns of the children. Everything is combined and used at the same time in order to greedily construct an embedding for an SPQR-tree node. The original algorithm doesn't do that; in particular, it doesn't exploit the preprocessing step to collect information on the incident edges, and for this reason it has to build all the [inaudible] embeddings for each SPQR-tree node, in particular all the possible disjunctive embeddings, leaving it to the parent to select one. The algorithm is still linear, since embeddings with the same color patterns are equivalent.
Our approach, however, computes exactly one embedding per node, which is more efficient. Moreover, by handling all the constraints at the same time, we can detect negative instances early. Among the modifications we have made to the algorithm is the part of the SPQR-tree algorithm that deals with P-nodes. Embedding a P-node basically means finding a good permutation of the virtual edges composing the parallel composition. In this problem, it is possible to group together those edges having uniform color patterns, for example these two. Grouping the virtual edges reduces the size of a P-node to at most 8 virtual edges, which is a [inaudible] constant, and for this reason it is possible to use brute force to search for a solution: basically, the instance can be matched against all possible solutions until one is found. The set of possible solutions is the set of those permutations yielding disjunctive and splitter-free embeddings, and it can be generated by an algorithm that starts from the desired result. Let's say that we want a permutation giving an embedding with a specific pair of color patterns. It must be noticed that, given two color patterns, there are several ways to align them, making them slide over each other in this way; for example, in this slide we can see three different alignments of the same pair of color patterns. Let's concentrate on the first one on the left as an example. This slide shows how an aligned pair of color patterns identifies colored zones where virtual edges with the same color can be placed in order to obtain a permutation with the desired color patterns. The algorithm that generates the set of solutions simply tries all the combinatorial possibilities of this process. The original algorithm performs a brute-force search as well, but it searches among all possible permutations, which, although limited by a constant, are still quite a lot: every possible permutation of 8 virtual edges must be tried, together with all the possible flips of the virtual edges. We search in a much smaller space, since we only match against good solutions, which is more efficient; moreover, the set can be generated in advance by an offline procedure, making the search very fast. Our approach can be summarized in three steps. First, the desired properties of the permutations are formalized in a specification; then the specification drives the generation of the set of all solutions; and finally, the instance is matched against all solutions until one is found. Now we'll describe the part of the algorithm that finds the Eulerian tour giving the ordering for the nodes on the spine of the book embedding. Let's assume that we have a disjunctive and splitter-free embedding of the graph. We construct the auxiliary graph by placing a node in each face of the original embedding and by adding directed edges in such a way that at each node they separate the red edges from the blue edges, in this way. Please note that although the auxiliary graph recalls the dual graph, it is not the dual, since it also contains the original nodes of the graph. The auxiliary graph obtained is Eulerian, and for this reason it can be decomposed into cycles. Since it is an embedded graph, the cycles are nested, and this hierarchy can be represented by a tree. Visiting the tree of cycles with a DFS clearly gives an Eulerian tour, since each edge is visited just once, and it can be made non-self-intersecting as well. The reason is that at each node the outgoing and incoming edges alternate in the clockwise ordering around the node, and by following this alternation it is possible to avoid intersections: when passing from a parent cycle to a child cycle nested inside it, following this principle gives a tour that is non-self-intersecting.
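The cycle decomposition itself is easy to sketch; the embedding-aware part, choosing and ordering the cycles so that the tour never crosses itself, is the real content of the proof and is not reproduced here:

```python
def cycle_decomposition(adj):
    """Split a directed Eulerian multigraph into edge-disjoint cycles.
    adj maps each node to a list of out-neighbours; every node is assumed to
    have equal in- and out-degree. This ignores the planar embedding, so by
    itself it does not give the non-self-intersecting tour from the talk."""
    out = {v: list(ns) for v, ns in adj.items()}   # mutable copy of out-edges
    for ns in adj.values():
        for w in ns:
            out.setdefault(w, [])
    cycles = []
    for start in out:
        while out[start]:                 # there is an unused edge leaving `start`
            cycle, v = [start], start
            while True:
                w = out[v].pop()          # follow and consume one out-edge
                if w == start:
                    break                 # closed a cycle back at the start node
                cycle.append(w)
                v = w
            cycles.append(cycle)
    return cycles
```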
By following these alternations it is possible to avoid intersections: by applying this principle when passing from a parent cycle to a child cycle nested inside it, it is possible to obtain a tour that is non-self-intersecting. And now I will discuss the implementation of this algorithm. We implemented it in C++ and used two graph drawing libraries: GDToolkit, which acts as the main framework of our application, and OGDF for its capability to build SPQR-trees in linear time. I said before that the P-node embedding part of the algorithm must perform a brute-force search. It is fast for a computer, but there are quite a lot of cases for a human programmer: there are 180 of them. For this reason we developed a code generator that, given the specification of the good permutations, generates the C++ code performing the brute-force search. And please note that the same generator also produced the table in this slide. And now the experimental part. This is the environment we used: basically a Linux platform with standard C++. In order to test the performance we needed a lot of positive instances. To this end we developed a graph generator, actually a solution generator, since it generates book-embedded graphs. The parameters of this generator are the number of nodes and the number of edges. We used the generator to produce three sets of instances with different edge densities; we did this to have different kinds of SPQR trees. In each set the graph size ranges from 10,000 to 100,000 nodes, and each value is averaged over five graphs of equal size. Now, a few charts showing statistics on our graphs. It can be seen that as the edge density increases, the number of connected components decreases, which is quite obvious. For example, in the last set we only have triconnected graphs, since they are basically triangulations. The same thing can be seen from the SPQR-tree point of view, with the last set having graphs with only R-nodes in their SPQR trees. These are our experimental results: the application runs in linear time on our data sets. To conclude, the result of our work is an implementation of an algorithm to tackle the P2BE problem. We did it by modifying an original algorithm by Hong and Nagamochi, with modifications aimed at increasing the performance, and by handling the part of the SPQR-tree algorithm that deals with P-nodes with a brute-force approach that uses code generation. What's more? In the future, variations of P2BE should be investigated because of its relationship with the other problems seen at the beginning. Variations should be investigated in order to try to model more cases of SEFE. We have seen that P2BE models the case where the intersection graph is a star, and one could try to generalize this result by modeling the case where there is more than one star, a double star, a caterpillar, or a tree. That's it. Thank you for your time. [applause]. >> Host: Any questions for Marco? >>: [inaudible] I actually don't understand why you need [inaudible] why you actually use -- >> Marco Di Bartolomeo: Although it's not augmented, the GDToolkit implementation of SPQR trees is not very fast. It's not efficient, it's not linear; I think it's something like quadratic or cubic, because when it was developed it was not aimed at being efficient. However, we found GDToolkit quite easy to use.
And we were very used to it, so it worked very well as the main framework for our application. And we mixed them together by translating the SPQR trees built by OGDF into GDToolkit. >>: My question is -- in this problem [inaudible] you kind of search for a [inaudible] you search for an embedding such that [inaudible]. >> Marco Di Bartolomeo: It's not for the dual graph. It's [inaudible]. >>: [inaudible]. >>: Have you also studied the problem where you don't [inaudible] the vertex at the end has to be the same? Have you also studied the similar problem where you don't search for this cycle but for a path, because this could be related to double and [inaudible] recently? >> Marco Di Bartolomeo: Okay. Actually not, because we -- >>: Because in this case you can have -- you can allow for the separators, but only for two of them, because the one separator is where you start this Eulerian path and mark the separator which [inaudible]. >> Marco Di Bartolomeo: Yes, but [inaudible] models [inaudible] the only way. You have to start at the beginning and end at the same point. >>: [inaudible] you don't have to restrict [inaudible] and essentially -- because this 2-page embedding is also related to these two graphs where you have this [inaudible] Hamiltonian cycle and this planar graph, and [inaudible] Hamiltonian cycle by Hamiltonian path. And the same happens [inaudible] Eulerian cycle [inaudible]. >> Marco Di Bartolomeo: I see. That's something [inaudible]. >> Host: Any other questions? Okay. So let's thank -- did you have a question? >>: [inaudible] the problem becomes if you could [inaudible] constraints on the [inaudible]. >> Marco Di Bartolomeo: Yeah. The original motivation for our work was the relationship between the SEFE problem and this kind of book-embedding problem. There is a result from Angelini et al., in 2012, I suppose, showing how the SEFE problem with a connected intersection can be reduced to a SEFE problem with a tree intersection, and this can be reduced to a book-embedding problem where the order of the nodes along the spine is constrained by a tree. And that's the reason for this last point. Yeah, it should be investigated. >> Host: More questions? Okay. Let's thank Marco again for a nice presentation. [applause]