>> Yuval Peres: Okay. So we're delighted to have Debmalya, a
long-term visitor and intern here, to tell us about optimization
problems and network connectivity.
>> Debmalya Panigrahi: Thank you, Yuval. Is the mic switched on?
Yeah. Thanks. So let's start with a bit of the history of
telecommunication. Perhaps the desire for a large scale, long distance
network started with the invention of the telegraph in the middle of
the 19th century.
So much so that by the early 20th century, there was already an
extensive network of submarine cables connecting the world's
continents. Of course, over the next 100 years, as new technologies
developed, new networks came up as well.
For example, the telephone network in the U.S. was fairly complicated
by the middle of the 20th century. And today we have the modern day
Internet. But what connects all these networks is the desire to
connect individuals, which means a natural set of questions for any
such network is how good are these connections.
For example, how many cable failures can the U.S.-to-Europe
connection survive in the telegraph network? Or how do we increase the
capacity of a connection, say, between Washington and Seattle? Or in
the Internet, how many link failures can cause it to get disconnected
into multiple pieces? These
are questions not limited just to communication networks but also to
many other kinds of networks.
For example, road networks or electrical circuits or even social
networks and process workflows, which are just virtual networks. So in
all of these networks we have seen that the key desire is to connect
entities, and therefore the key questions are related to the
connectivity of these networks.
So instead of asking these questions specific to these applications,
can we abstractly define some of these questions so that we can propose
unifying solutions for all of these applications? And if we want to
abstractly classify connectivity questions, then there are two broad
classes of questions we come up with.
The first I call network analysis questions, which is finding the
connectivity properties of existing networks. So I'm given a network:
what is the minimum cut in this network, or what is the sparsest cut?
What kind of flows can we sustain between two points in the network?
The second broad class of questions are what are called
network design questions, and here we don't have an existing network;
rather, we are given priced network elements, nodes and edges, and we
want to put them together to achieve some desired connectivity
properties.
So we want to achieve some desired rate of flow between two points or
achieve some desired robustness in the network. So these are the two
classes of questions I'll focus on today. In particular, I'll talk
about two problems or two groups of problems. For network analysis
questions I'll talk about minimum cut problems. There will be several
problems I'll talk about in this domain.
I'll also talk about network design, and the problems I'll specifically
concentrate on are Steiner tree problems. These are two
fundamental problems in these classes.
And towards the end of the talk I'll also bring up a third class of
problems called cut sparsification, which is very closely related to
network analysis but does not quite fall in either of these two
categories. So that's the general plan of the talk: we'll start with
minimum cuts, go to Steiner trees, and end with sparsification.
So let's start with minimum cuts. There are two kinds of min cut
problems that have received a lot of attention. One are called the
local connectivity problems. The other class is called global
connectivity.
In local connectivity, we are given a graph, and we are given two
particular vertices, two selected vertices in the graph, called terminals.
So here these terminals are denoted by S and T, and the goal is to find
the smallest cuts that separate S from T. In this case, these cuts
are three edges each.
In global connectivity problems, or global min cuts, the goal is simply
to find the smallest cuts in the network. We are not given any
terminals. It's just overall what are the smallest cuts in the
network.
So of course both of these represent the fragility of the network to
failures. And since these are foundational questions in connectivity,
there has been a lot of interest over the years. And many algorithms
have been proposed to find the local connectivity of a single vertex
pair.
This is connected to the max flow question, for example, by duality,
and similarly there have been many algorithms proposed to find a single
global min cut in the graph.
But what we will focus on in this talk are not these algorithms, but
how do we find the local connectivity of all pairs of vertices. So if
you think of any application, it's not feasible to say that I'll find
the local connectivity of Boston and Seattle today throw out my
computation completely, find the local connectivity of two other
locations tomorrow. Instead if it would be better if you could find
connectivity of all pairs in the graph and encode them in a data
structure from which we can easily query any particular connectivity.
It's similar to pairwise distances, for example, in a graph.
We often try pairwise distances for all pairs of vertices and encode
them in a data structure and query the data structure efficiently. So
this is the problem we will focus on for local connectivities.
For global connectivities, imagine that you want to increase the
robustness of the network. So if you want to do that, it's not
sufficient to just find one min cut.
If I increase the number of edges in one min cut there could be other
min cuts lurking around but still reduce the effectiveness of my
increase in robustness.
So what we would be interested in is can we find all the global min
cuts in the network. How fast can we find these two quantities. So
let's start with local connectivity.
As I said, the goal here is to find the local connectivity of all
vertex pairs. And this question was asked as early as 1961, when
Gomory and Hu came up with a very elegant data structure. They showed
that all these N squared different local connectivities, one for each of
the N squared vertex pairs, can actually be represented in linear space,
O of N space. So the question is whether we can construct this data
structure, whether we can construct this linear sized representation of
all these N squared min cuts in a graph.
>>: Talking of undirected graphs.
>> Debmalya Panigrahi: Yes. Only talking about undirected graphs
here, that's right. So that is the focus of the first part of the
talk. And there have been many algorithms proposed to construct a
Gomory-Hu tree or other data structures that find all the local
connectivities in a graph. The best running time at the moment is N
times N to the three halves, and I'll show you where this running
time comes from.
But if we look at this sequence of algorithms over the last 40 to
50 years, we see something very surprising. All these algorithms are
based on exactly the same idea.
So take this set of N squared min cuts, identify some linear number of
min cuts that are critical, from which I can construct my entire
Gomory-Hu tree. And then once I've identified these min cuts, use your
favorite local min cut algorithm to find these min cuts and put them
together in a Gomory-Hu tree.
All the previous algorithms exactly follow the same recipe. And this
is the recipe they follow. Where is the difference? Well, there are
differences in how you identify these min cuts and also differences in
how you find the min cuts once you've identified them.
So the basic algorithm for finding all local connectivities in a graph
has remained unchanged for more than 40 years. Which obviously begs
the question: is there something we are missing by focusing only
on this one recipe for finding Gomory-Hu trees?
So to answer this question, let's step back and try to see what a
Gomory-Hu tree is. So here's a definition. It's a weighted tree on
the same set of vertices as the input graph.
So let's say here's an input graph on the left, I construct a weighted
tree on the same set of vertices, and it must have the following
property: If I look at any pair of vertices, let's say S and T, then
if I go to the tree and look at the min cut separating S from T, this
min cut on the tree is very simple. It's simply the lightest edge on
the path connecting S to T. So here's a min cut that separates S from
T in the tree.
I take that min cut, bring it back to my graph. The graph and the tree
are on the same set of vertices. The cuts correspond to each other.
And this cut should satisfy two properties. One, it should be a min
cut separating S from T in the graph as well. And two, the number of
edges in this cut should be exactly equal to the weight on that edge in
the tree.
Which means that the tree really represents all pairwise connectivity
values, for any pair of vertices from the tree we can easily find the
local connectivity of that pair.
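To make the query concrete, here is a minimal sketch, not from the talk, of reading a local connectivity off a Gomory-Hu tree given as an adjacency list of (neighbor, weight) pairs: the answer for a pair S, T is simply the minimum edge weight on the unique tree path between them.

def local_connectivity(tree, s, t):
    # tree: {vertex: [(neighbor, weight), ...]} describing the Gomory-Hu tree.
    # Walk the unique s-t path with a depth-first search, tracking the
    # lightest edge seen so far along the current path.
    stack = [(s, None, float("inf"))]
    while stack:
        node, parent, lightest = stack.pop()
        if node == t:
            return lightest
        for nbr, w in tree[node]:
            if nbr != parent:
                stack.append((nbr, node, min(lightest, w)))
    return None  # s and t lie in different components of the tree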
So this is a Gomory-Hu tree. How do we construct it using local min
cuts? It's actually a very simple algorithm. So initially what we do
is we select any arbitrary pair of vertices, say S and T. We find a
local min cut for that pair. So here is a min cut.
That identifies an edge in the Gomory-Hu tree for us. Corresponding to
that min cut, we get an edge whose weight is exactly equal to the
number of edges in that cut. Once we have gotten that edge, we know
that each side of that cut will be separated by the edge in the tree as
well.
So once we've identified that edge, we can recurse in the following
manner. What we do is we create two instances of the problem, where in
each instance we retain one side of the cut and contract the other
side. So here, for example, we create two instances. So if you go
back, you'll see that in one instance I've contracted the bottom half
of the cut. In the other instance I've contracted the top half of the
cut. And now I recurse on these individual instances by, again,
picking two vertices on the side I did not contract and repeating this
until I get all the N minus 1 edges in the Gomory-Hu tree. Which means
that really what I am doing is N minus 1 local min cut computations.
Every computation reveals an edge in the tree. N minus 1 edges.
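As an illustration only, here is a rough brute-force sketch of that recursion, not the speaker's algorithm; it is exponential time and simplified, in that it skips the bookkeeping the real Gomory-Hu construction uses to reattach earlier tree edges, so it conveys the shape of the recursion rather than a full implementation.

from itertools import combinations

def min_st_cut(vertices, edges, s, t):
    # Brute-force minimum s-t cut in an unweighted graph: try every vertex
    # side containing s but not t, keep the one with fewest crossing edges.
    best_val, best_side = None, None
    others = [v for v in vertices if v != s and v != t]
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            side = {s, *extra}
            val = sum(1 for u, v in edges if (u in side) != (v in side))
            if best_val is None or val < best_val:
                best_val, best_side = val, side
    return best_val, best_side

def contract(vertices, edges, block, label):
    # Merge every vertex of `block` into one super-vertex `label`,
    # keeping parallel edges (unit capacities).
    new_vertices = [v for v in vertices if v not in block] + [label]
    new_edges = []
    for u, v in edges:
        u2 = label if u in block else u
        v2 = label if v in block else v
        if u2 != v2:
            new_edges.append((u2, v2))
    return new_vertices, new_edges

def gomory_hu_sketch(vertices, edges):
    tree_edges = []
    def recurse(verts, eds, terminals):
        if len(terminals) < 2:
            return
        s, t = terminals[0], terminals[1]          # any pair of uncontracted vertices
        value, s_side = min_st_cut(verts, eds, s, t)
        tree_edges.append((s, t, value))           # one tree edge per min cut computation
        s_terms = [x for x in terminals if x in s_side]
        t_terms = [x for x in terminals if x not in s_side]
        recurse(*contract(verts, eds, set(verts) - s_side, ("super", t)), s_terms)
        recurse(*contract(verts, eds, s_side, ("super", s)), t_terms)
    recurse(list(vertices), list(edges), list(vertices))
    return tree_edges                               # N minus 1 weighted edges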
The best algorithm to find a local min cut takes time N times square
root of N, and that's where the N times N to the three halves running
time comes from.
So this is what the state of the art is. But, again, is there
something missing here? And one thing that this algorithm is ignoring
or this -- yeah.
>>: [inaudible] before the work of [inaudible].
>> Debmalya Panigrahi: No, the problem is that this entire recipe only
works if it's an exact algorithm for min cut. If it's an approximate
algorithm, then once you contract, the min cuts change. If it's an
exact min cut algorithm, you can do this contraction, otherwise you
can't. And Modral's algorithm is only for -- only getting you an
approximate min cut. It doesn't get you an exact one.
So the approximations basically build up on the recursion tree. So if
we look at the comparison, a comparison between local min cut and
global min cut algorithms, then until the early '90s when these
algorithms were developed, it was thought that local min cuts are
easier. In fact, all the algorithms for local min cuts were faster.
Global min cuts were typically found by local min cut computations.
But things have changed over the last 20 years.
In fact, now we can find a global min cut in approximately linear time,
whereas local min cuts are still quite far from linear time. So the
idea is: perhaps in this recipe we can change the local min cut
computations to global min cut computations.
Can I change the local min cuts to global min cuts? What happens?
Well, here's a graph. I find a global min cut in the graph. It does
identify an edge in the Gomory-Hu tree for me. So I can use the same
idea. This identifies an edge. I contract the two sides to create two
instances and recurse.
The problem, however, is in the recursion. What happens in the next
stage? Well, we are finding global min cuts. So we keep finding the
same min cut. And we don't make any progress whatsoever.
Now, the problem is that by insisting on finding global min cuts
in the recursive stages, we are not able to find cuts that split
the part of the graph that I did not contract.
That is what I need: small cuts, but cuts that split the part of the
graph I did not contract. So to do this, we introduce a new problem
called a Steiner min cut problem, which finds the smallest cut that
separates a set of terminals. So I give you a graph. I also allow you
to give me a set of terminals in the graph which is a set of vertices.
And I want the smallest cut which splits these terminals into two
pieces. So if you replace global min cuts by Steiner min cut
computations --
>>: [inaudible].
>> Debmalya Panigrahi: Sorry?
>>: How is this different from multiway cut?
>> Debmalya Panigrahi: In multiway cut you get many pairs. You want each
pair to be separated. Here I'm not enforcing any conditions on how
this set gets separated. I only want this set to be separated.
>>: You only want to partition it into two halves, not necessarily --
>>: You just want one half.
>> Debmalya Panigrahi: At least one.
>>: Okay.
>> Debmalya Panigrahi: I want at least one terminal on each of the two
sides, yes. Okay.
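To pin down the definition just discussed, here is a tiny brute-force illustration, not from the talk and exponential time: a Steiner min cut is the smallest cut leaving at least one terminal on each side, and taking the terminals to be all vertices or a single pair recovers the global and local min cuts.

from itertools import combinations

def steiner_min_cut_value(vertices, edges, terminals):
    # terminals is a set of vertices. Enumerate all vertex bipartitions and
    # keep the smallest cut that splits the terminal set into two nonempty pieces.
    verts = list(vertices)
    best = None
    for r in range(1, len(verts)):
        for side in combinations(verts, r):
            side = set(side)
            if not (terminals & side) or not (terminals - side):
                continue  # all terminals on one side: the cut does not split them
            crossing = sum(1 for u, v in edges if (u in side) != (v in side))
            if best is None or crossing < best:
                best = crossing
    return best

# terminals == set(vertices) gives the global min cut;
# terminals == {s, t} gives the local s-t min cut.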
So if you replace global min cuts by Steiner min cuts what happens?
Well at the top level, at the first level of recursion, I'll define my
terminals as the entire set of vertices. So it's just a global min cut
at that point.
At the next level, what I do is I define my set of terminals as the
side I did not contract. And then I make progress, right? Because the
new cuts I get actually split these sides.
So this gives me a correct algorithm: after N minus one iterations I'll
get a Gomory-Hu tree. What's the runtime? Well, one problem is that
if you notice carefully, this notion of a Steiner min cut generalizes
both local and global min cuts. If I define my terminals as the entire
set of vertices it's a global min cut. If I define my set of terminals
as just two vertices, that's a local min cut. So clearly I can't hope
to beat the runtime of a local min cut computation here. In
fact, the best algorithm we could come up with for finding a Steiner
min cut takes more time than a local min cut. In fact, it takes time M
times N.
So have you made any progress? We have a new algorithm. But really
we're losing in the runtime. Now comes the key aspect of Steiner min
cuts. And this is that it gives an advantage not just that it gives an
alternative approach, but it gives a very key structural advantage over
local min cuts.
So if you think of a local min cut computation what does it do? It
essentially computes a flow between S and T. So a flow is just a
collection of paths. Now, once I've computed a set of paths between S
and T, in the next step, when S and T get separated, these paths are
completely useless. So I have to start from scratch again.
What a Steiner min cut does is it computes a set of trees. It packs
trees rooted at a vertex S. S goes to one of my two sub problems. All
these trees go with it and can be used without any change whatsoever.
So even though we are doing N minus one Steiner min cut
computations, there's significant overlap in the work that I'm doing at
these various computations. In fact, an amortization
argument shows that we get an algorithm that runs in M times N,
which is the same as just a single Steiner min cut computation. The
entire set of N minus 1 Steiner min cut computations costs the same as
one single computation.
This improves the runtime for finding all the local min cuts. In fact,
it turns out that this is optimal provided a certain conjectured cut
condition holds. And this is due to Albert and Saul. So a quick recap. There are two
things here. One that local min cuts are not necessary for computing
Gomory-Hu trees and can be replaced by Steiner min cuts.
Two, even though a single Steiner min cut computation is slow, slower
than a min cut computation, a sequence of Steiner min cut computations
is much faster than a sequence of local min cut computations. And
that's where we get a new algorithm.
So having looked at local connectivity --
>>: Conditional?
>> Debmalya Panigrahi: So either this algorithm cannot be improved or
a certain dynamic cut condition is false, which is hypothesized to be
true.
>>: [inaudible] so the runtime cannot be improved.
>> Debmalya Panigrahi: Yes.
>>: So you should improve that to improve something else?
>> Debmalya Panigrahi: You disprove some other hypothesis that is
generally believed to be true.
>>: So you use essentially the sequence of the Steiner vertices as a
change in something; is that --
>> Debmalya Panigrahi: No, I didn't get you. So the sequence -- so
you have a set of terminals. In the next step that set of
terminals splits into two sets of terminals.
And the fact that you chose one vertex to root your trees at -- that
vertex goes to one of the two sub problems, and the trees remain useful
in the sub problem it goes to.
All right. So having looked at local min cuts let's move on to global
connectivity. As I promised you, the problem we would be looking at is
finding all global min cuts.
How many global min cuts are there in a graph? For all we know, there
could be exponentially many min cuts in a graph.
So the first question here is how do we even count the number of cuts
and if we can count them how do we represent them?
It turns out that one can show that there are only quadratic number of
global min cuts, and not only that, that all of these cuts can be
represented as in the local connectivity case in just linear space.
And this is called a cactus. This is a data structure that has been
around for 30 years now.
So really the question is how do we construct a cactus? And there have
been many algorithms, again, over the last 30 years, the best runtime
at the moment is quadratic in the number of vertices.
Again, exactly as in the local connectivity case, if we look at all
these algorithms, they have the same recipe. I give you the input
graph. You somehow list all the min cuts in the graph and then put
them back into a cactus. Now, this listing could be very succinct.
There could be very succinct listings, each min cut might be found by a
very efficient algorithm. But this intermediate step of listing all
the min cuts appears in all of these algorithms. Now, is there something
specific about the quadratic runtime in the sense if we want to improve
the running time of an algorithm for finding all global min cuts, can
we still follow the same recipe and make changes to how we list the min
cuts or how we find min cuts or is the entire recipe going to get stuck
at quadratic.
In fact, it turns out that there's a significant fundamental barrier at
quadratic run times, and it comes from the following simple algorithm.
Sorry, simple example. So here's a cycle; a cycle has just [inaudible],
so it's just a linear number of edges, but how many min cuts does the
cycle have? I remove any two edges; that's a min cut for me. So it has
N edges, but N squared min cuts. Now, if it has N squared min cuts,
then no matter how I list these min cuts, how I find them, I will incur
at least N squared time in the intermediate step; as long as I'm
listing min cuts I'm in trouble. So really if you want to improve the
runtime of finding all global min cuts in a graph, then we have to
somehow get rid of this intermediate step.
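A quick brute-force sanity check of this example, my own illustration: on an N-vertex cycle the minimum cut value is 2, and there are N times N minus 1 over 2 distinct min cuts, one for each pair of removed edges, even though the cycle has only N edges.

from itertools import combinations

def min_cuts_of_cycle(n):
    edges = [(i, (i + 1) % n) for i in range(n)]
    cut_values = []
    for r in range(1, n // 2 + 1):
        for side in combinations(range(n), r):
            side = set(side)
            if 2 * r == n and 0 not in side:
                continue  # avoid counting a bipartition and its complement twice
            cut_values.append(sum(1 for u, v in edges if (u in side) != (v in side)))
    best = min(cut_values)
    return best, sum(1 for c in cut_values if c == best)

print(min_cuts_of_cycle(8))  # (2, 28): min cut value 2, and 28 = 8 * 7 / 2 min cuts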
And the key structural property we prove is that it's not all these N
squared min cuts that are equally important. There exists a subset of
a linear number of min cuts, so O of N min cuts, that encodes the structure
of all the min cuts in a graph. This is the key structural property
that helps us in reducing the runtime from N squared.
In fact, once we have this property, the next question obviously is if
we can identify these min cuts how do we find them? It turns out again
the same technique as for Steiner connectivity works, and this is using
tree packings. Now, there are many algorithms for tree packings, one
algorithm that really works out well is something that I worked on in a
separate project; it does an [inaudible] on graphs. But that aside, when
we put these together, we get the first linear time algorithm for cactus
construction.
Or the first linear time algorithm to find all the min cuts in a graph.
And, of course, this cannot be improved except for logarithmic factors.
We have to look at every edge at least once to find min cuts.
So this brings us to the end of the first part of my talk on network
analysis and I'll pause for questions. All right. So let's move on.
Let's move on to network design now. And here I'll focus on the
Steiner tree problem. So what's the Steiner tree problem? It's the
most basic problem in any network design context. So I'm given a set
of locations. Let's say some Microsoft Offices. And I want to connect
them in a network. What's the cheapest way to connect them?
So more formally, we'll look at online Steiner tree problems. So all
the terminals are not given to me in advance, but they come online.
And how do we augment our Steiner tree to maintain near-optimal cost?
So here's a formal definition of the problem. We are given an undirected
graph offline, and this graph has edge and node costs. So every node
has a cost, every edge has a cost. Online we get a sequence of
terminals. We get a vertex and another terminal and so on. These are
vertices in the graph.
Now when terminal T sub I arrives we have to connect it to the previous
terminals by augmenting the graph we have already built. So here's
where the algorithm is online. We can't throw away something we
already bought.
And, of course, the objective is to minimize the cost. So to
understand the problem and to see a very simple first attempt, here's a
greedy algorithm for -- by the way, I should mention that edge costs
can be simulated by node costs. If an edge has cost C, I can always
replace it by two edges with a node having cost C connecting them.
This is without loss of generality. I'll only talk about node cost
from now on.
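For concreteness, a minimal sketch of that reduction, my own illustration: each edge of cost C is subdivided by a fresh node of cost C, after which every cost lives on a node.

def edge_costs_to_node_costs(node_cost, edge_cost):
    # node_cost: {vertex: cost}; edge_cost: {(u, v): cost}
    new_node_cost = dict(node_cost)
    new_edges = []
    for (u, v), c in edge_cost.items():
        mid = ("mid", u, v)                 # fresh node subdividing edge (u, v)
        new_node_cost[mid] = c              # it inherits the old edge cost
        new_edges += [(u, mid), (mid, v)]   # the two replacement edges are free
    return new_node_cost, new_edges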
>>: This increases the number of nodes by a lot.
>> Debmalya Panigrahi: It makes it N plus M. But here we will get
algorithms which are polynomial in any case. I mean, these would be -- the
main focus will be on the competitive ratio, not on the running time.
All right. So here's a very simple algorithm for the problem. It's a
greedy algorithm. I get a terminal. At this point I don't need to do
anything. There's no constraint. I get a second terminal. I buy the
cheapest path connecting my two terminals.
In this case it's a path containing the orange vertex and has cost one.
I get a third terminal. I again buy the cheapest path connecting it to
previous terminals.
Again I incur a cost of one. If I keep doing this for N terminals,
then overall I have a solution which buys all these orange vertices
and has a cost of N. On the other hand, if you see the optimal
solution, that simply buys the blue vertex and has a cost of two. So
clearly this greedy algorithm does not work well. But the surprising
fact is that this is the best algorithm that was known for the problem.
There was no sub-polynomial competitive algorithm known for
the online Steiner tree problem. So let's look closely at this
algorithm. What's going on?
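Here is a minimal sketch of the greedy algorithm just described, my own code rather than the speaker's, for node costs on a connected graph: when a terminal arrives, run a Dijkstra-style search in which already-bought vertices cost 0, and buy the cheapest path reaching the tree built so far.

import heapq

def greedy_online_steiner(adj, node_cost, terminals):
    # adj: {v: [neighbors]}; node_cost: {v: cost}; terminals arrive one by one.
    bought, total = set(), 0
    for i, t in enumerate(terminals):
        if i == 0 or t in bought:
            if t not in bought:
                bought.add(t)
                total += node_cost[t]
            continue
        # Dijkstra from t, where a path's length is the cost of its unbought vertices.
        dist = {t: node_cost[t]}
        prev, heap, reached = {}, [(dist[t], t)], None
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue
            if u in bought:
                reached = u            # cheapest connection to the existing tree
                break
            for v in adj[u]:
                nd = d + (0 if v in bought else node_cost[v])
                if nd < dist.get(v, float("inf")):
                    dist[v], prev[v] = nd, u
                    heapq.heappush(heap, (nd, v))
        # Buy every unbought vertex on the path from the tree back to t.
        cur = reached
        while cur is not None:
            if cur not in bought:
                bought.add(cur)
                total += node_cost[cur]
            cur = prev.get(cur)
    return bought, total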
Well, for every choice that the greedy algorithm makes, which is a
greedy path connecting the current terminal to a previous terminal, the
optimal algorithm also makes an alternative choice. It also connects,
like the current algorithm, the current terminal to a previous terminal.
And the way we can analyze the cost of the greedy algorithm is by
simply summing up the costs of the optimal algorithm. Of the choices
that the optimal algorithm is making.
Now, if the sum of these costs is at most rho times the optimal cost, then
we can claim that the greedy algorithm is rho competitive,
because on every choice the greedy algorithm pays no more than the optimal
algorithm's alternative. Now, the key thing here is that when we sum the cost of the
optimal choices, then even if a vertex appears on 10 of these paths, 10
of these optimal paths, we have to count the cost of the vertex 10
times.
Because on the corresponding greedy choices, we have no guarantees of
overlap. So when we are summing up the optimal costs,
this is just a naive sum, even if the same thing appears multiple
times, we sum it up.
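In symbols, with my notation rather than the slides': let $P_i$ be the greedy path bought for terminal $t_i$ and $Q_i$ the optimal algorithm's alternative path for $t_i$. Greediness gives $c(P_i) \le c(Q_i)$ for every $i$, so if the naive sum satisfies $\sum_i c(Q_i) \le \rho \cdot \mathrm{OPT}$, then $\sum_i c(P_i) \le \rho \cdot \mathrm{OPT}$ as well; the catch is that this naive sum charges a vertex once for every path $Q_i$ it lies on, which is exactly what blows up to $\Omega(N)$ in the example that follows.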
Now, if we use this recipe for the greedy algorithm,
what happens? So if we look at the optimal paths, these are the red
paths, these are the choices the optimal algorithm makes.
Now, the same blue vertex appears on all of these paths. So while the
optimal cost is just two, when we sum up all these optimal paths, the
cost becomes 2 times N. Because as I said even if it's the same vertex
since it appears on all of these paths, we have to sum it up N times.
And that's where the competitive ratio becomes omega of N. But now if
someone gave me the liberty of taking out a vertex from the optimal
paths. So I have to sum up the optimal paths but I'm allowed to do it
after removing exactly one vertex from each path. If someone gave me
this liberty, then in this particular example I'm in good shape.
Because on all of these red paths I will pluck out the blue vertex,
and then all the paths sum up to some small value, in fact 0 in
this case. And therefore the cost of the greedy algorithm is small,
provided we are allowed this extra liberty.
In fact, very --
>>: [inaudible].
>> Debmalya Panigrahi: So I'm saying that instead of the greedy
property, if we make it a slightly weaker property, where on all these
greedy paths we don't need to sum up the costs of the paths, but we can
identify a vertex on the path so that I'm not going to sum the cost of
this vertex. For everything else, I will have to sum up the costs.
It's exactly one vertex that I'm allowed to remove.
>>: It's not that you're taking it out of the path, it's just that
you're not counting.
>> Debmalya Panigrahi: I'm not counting the cost in the lemma. So in
this particular example, it works out. Now, very surprisingly this
works out in general. And here's the lemma. For any sequence of
terminals there always exists a set of paths and you think of these as
the optimal paths, such that path P sub I connects terminal T sub I to a
previous terminal and has the following properties.
There always exists one vertex on each of these paths, such that if
you remove that vertex from the path, then the cost of the remaining
path sums up well. So in fact the sum of costs of the remaining paths
is at most log N times opt.
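Stated compactly, in my notation: for every terminal sequence there exist paths $P_i$, each connecting $t_i$ to an earlier terminal, and vertices $v_i \in P_i$ belonging to the optimal solution, such that $\sum_i c(P_i \setminus \{v_i\}) \le O(\log N) \cdot \mathrm{OPT}$.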
Remember that if you were not allowed to take V sub I out of the path,
then this sum would have been N times opt. But just by taking out one
vertex from every path we bring it down to log N times opt. And, of
course, these are vertices on an optimal, in an optimal solution. And
therefore the cost of these vertices overall is at most opt. Here
we are not double counting: if the same vertex V sub I appears on 10 different
paths, we just count it once.
We call this the almost greedy property, because it's just one vertex
we have to remove. Now, what is this property gaining us in terms of
an algorithm? Well, think of in general when I get a new terminal I
have exponentially many choices of how I connect it to a previous
terminal.
If I had the greedy property, then when I get a new terminal I only
have one choice. It's the cheapest path to connect to a previous
terminal. Here I'm sitting somewhere in between. If I can identify V
sub I, then really I have just one choice. I will reduce the cost of V
sub I to 0 and take the cheapest path.
But how do I identify V sub I? I don't know V sub I in advance, which
means that I have N candidate paths now, which sit somewhere between the
exponential number of choices if you had no properties and the one exclusive path
if you had the full greedy property. So for every terminal we now have
N choices rather than one choice.
And this lets us reduce the online Steiner tree problem to the online
nonmetric facility location problem. If you know this problem, you can
guess what's going on. The facility is this one choice that I make.
There are N facilities. I make a choice of a facility. Once I have
made that choice, the connection cost is fixed for me.
It's not exponentially many choices but just polynomially many choices.
And there were already algorithms for this problem, the online
nonmetric facility location problem, which lets us get the first
polylog competitive algorithm for the online Steiner tree problem.
And in fact this algorithm is optimal up to a log K factor where K is
the number of terminals. So moving on -- oh, before I move on, I want
to give you a very quick proof of the property
that I showed, the almost greedy property. And here's a proof. So I
define a spider as a tree where at most one vertex has degree greater
than two. We call this vertex the head of the spider.
All the leaves are called the feet of the spider, and a path connecting
a head to the foot is called the leg of the spider. So here I'll prove
the entire property in just one slide. And here's the proof.
We look at the optimal tree. This is the optimal Steiner tree. On
this we first identify the vertices which are nonleaves but are
terminals. Now, these are be replaced by two vertices, one nonleaf,
nonterminal vertex, having the same cost as the original vertex.
And then a dummy terminal vertex which is cost of 0. And this
replacement is without loss of generality. Now, once I have made this
replacement, I do a recursive decomposition of the tree as follows. I
find the spiders at the lowest level of the tree.
So these are spiders at the lowest level. I look at a subsequence of
terminals on any one such spider. So T sub 1 followed by T sub 2 followed by T
sub 7 gives a subsequence on the first spider.
Then I define my paths from each of these terminals, except the first
one in the subsequence, to its immediate predecessor in the subsequence. So
T sub 7 goes up to the root and goes down to T sub 2. The vertex that
I will pluck out of each path, I define that to be the root of its
spider.
Once I have done this, I have gotten paths for T sub 2, 7 and 4, I
remove these from my tree and then I simply recurse.
I go through two more recursive levels and now I've got paths for all
terminals except for the first one, and on all of these paths I've
identified these vertices to pluck out.
So at this point I've identified the paths and the vertices. Now, do
these paths sum up well if the vertices are removed? Well, yes they
do, because there are at most log N levels. And on each level every
vertex except the root that I plucked out appears on at most two paths,
one going down and one going up. Which means that the total cost over
all of these paths is at most log N times or twice log N times opt.
So that's the full proof. Now, let's move on to a slight
generalization of Steiner trees. The first general -- I'll talk about
two generalizations. Both generalizations have the same basic
structure. So instead of considering a monochromatic set of terminals
we split the terminals into multiple groups. Here are two groups of
terminals.
Now, the constraint based on this grouping of terminals is different in
the two problems. In the first problem, called the group Steiner tree
problem, the goal is to connect a representative terminal from each
group. From each group I'm allowed to select any terminal and then
these representatives need to be connected in the cheapest possible
manner.
In the Steiner forest problem, we have to connect terminals internal to
a group but don't need to connect terminals across groups. So, for
example, whereas this one orange vertex is a solution, is a valid
solution for group Steiner tree, it connects one red terminal to one
purple terminal, it's not valid for Steiner forest because it's not
connecting anything internal to groups.
So for Steiner forest, these two orange vertices are a valid solution.
So these are two very standard generalizations of the Steiner tree
problem. And we also give polylog competitive online
algorithms for these problems, the only catch being that
these algorithms are quasi-polynomial time, whereas the previous
algorithm, the algorithm I showed you in detail, the online Steiner
tree algorithm, is actually polynomial time.
So now let me move on to a slightly different conceptual generalization
of Steiner trees. And these are called network activation problems
which I introduced recently. If you look at a Steiner tree problem,
then one way of looking at it is the following.
At every vertex we have two choices. Either we pay a cost of zero or
we pay the cost of the vertex. Now, if we look at any particular edge,
I get the edge only if I choose to pay the cost of both the vertices
at its two ends. So this is just a different view of the problems we
have been talking about: there are two choices at a vertex, and an
edge gets activated provided I pay the cost at the two ends.
Now, in many situations things are slightly more general. For example,
in choosing power levels of vertices and things like this.
Instead of having two choices, there are multiple choices at a
vertex. And when is an edge activated? Well, it's dictated by a
general activation function which maps the choices I make at the two
ends to whether the edge is active or not. And the only constraint I
have is that it should be monotonic. If I decide to pay more the edge
shouldn't vanish.
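As a small illustration of the model, not from the talk, here is one way to write it down in code; the power-threshold function at the end is my own choice of a monotone activation function, of the kind that comes up when the choices are wireless power levels.

def active_edges(edges, choice, activation):
    # choice[v] is the level chosen at vertex v; activation[(u, v)] is a
    # monotone boolean function of the two endpoint levels.
    return [(u, v) for u, v in edges if activation[(u, v)](choice[u], choice[v])]

def total_cost(level_cost, choice):
    # level_cost[v][level] is the price of setting vertex v to that level.
    return sum(level_cost[v][choice[v]] for v in choice)

def power_threshold(theta):
    # Example monotone activation: the edge comes up once the two endpoint
    # levels together reach the threshold theta.
    return lambda a, b: a + b >= theta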
So under this much more general model can we solve Steiner tree
problems or other problems, other network design problems? In fact, we
show that some very carefully chosen greedy algorithms can achieve
logarithmic approximation factors for a variety of network design problems, including
Steiner trees but also higher connectivity, such as biconnectivity
problems, and so on and so forth. So a natural question is: is this
model redundant? Is it exactly the same in computational power as the
standard models that we have?
But that is refuted by showing that the minimum spanning tree
problem in this model is NP hard, in fact log N hard to approximate,
whereas in all standard models that we know of
this problem is very easily solvable in polynomial time. So there is a
separation and the separation is represented by just one problem.
Of course, these are just the tip of the iceberg. There are many other
problems one can explore in this framework, and there has been some
follow-up work looking at some other problems. But a lot still needs
to be done.
So that is all I have to say about network design problems, and, again,
I'll pause for a short break, if there are any questions. All right.
So let's move on to the last section of the talk, and this is about cut
sparsification.
So what is the cut sparsification problem? Well, the idea is if I'm
given a very dense graph, can I sparsify it? So, for
example, if the graph initially has roughly N squared edges, can I
reduce the number of edges to roughly linear in the number of vertices?
And then for every edge that I retain in the sparsifier, I will make it
a thicker edge. So I'll put a weight on it. In order to ensure that
for every cut the weight of the cut is approximately preserved in the
sparsifier. So I reduce the number of edges, make the edges thicker,
such that the weights of all cuts are preserved. So, of course,
there's some combinatorial interest in this problem because it's not
clear a priori that such sparsifiers even exist. But on top of that
it also has a significant algorithmic implication. So in almost all
cut algorithms, for example minimum cut, sparsest cut, max cut, et
cetera, running time depends on the number of edges.
This gives us a handle that can reduce the running time from depending
on a number of edges to the number of vertices, by sparsifying and then
running the algorithm. Except that in some cases we have to settle for
an approximation whereas the problems were potentially exactly solvable
in polynomial time.
Okay. So this gives us a handle to trade off accuracy of the algorithm
with running time.
How would we sparsify a graph? Well, the most obvious technique would
be to simply uniformly sample all edges. We sample every edge at a
probability which is dictated by how much we want to reduce the size of
the graph. And if an edge is selected, we simply give it
a weight of 1 over the sampling probability.
So in expectation every cut is what it was earlier. Does this work?
Well, not quite. Because of what are known as dumbbell graphs. So
imagine you have a graph with two complete portions
that are connected by a single thin edge. If you sample all the edges
at rate 1 over N the graph for all you know gets disconnected. It
almost always gets disconnected because the single edge is being
sampled at a very small probability.
So the natural fix is that this edge, the single edge that connects the
two heavy portions, must be sampled at a high probability, and the
edges inside the two portions on the sides should be sampled at low probabilities.
To put this in the formal language, the probability of sampling an edge
should be dictated by the size of the smallest cut containing the edge.
Or, in other words, the smallest cut that separates the endpoints of
the edge, which is the local connectivity of the edge.
And in particular, we should not uniformly sample edges but the
probability of sampling an edge is inversely proportional to its local
connectivity. If an edge has small local connectivity, for example
the connecting edge here, its probability is high. If it has high local
connectivity, such as the edges in the two complete portions, then its
probability of sampling is low. Of course we want to make this
unbiased, so instead of giving a unit weight to every selected edge, we'll
give it a weight according to this probability of sampling.
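A sketch of that sampling scheme, for illustration: lam[e] stands for the local connectivity of edge e, which is the expensive part to obtain, and c is an oversampling parameter, roughly logarithmic in N for the guarantees discussed next; both are treated as given here.

import random

def connectivity_sample(edges, lam, c):
    # Keep edge e with probability p_e = min(1, c / lam[e]); if kept, give it
    # weight 1 / p_e, so each cut keeps its original weight in expectation.
    sparsifier = {}
    for e in edges:
        p = min(1.0, c / lam[e])
        if random.random() < p:
            sparsifier[e] = 1.0 / p
    return sparsifier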
So the expected weight of the edge remains one. So this is a scheme
for sparsification. Does this work? Well, there are two things that
we need to check. First, does it even produce a sparse graph? For all
you know all the edges are retained.
It turns out that this is easy to show. You can show that the sum of
inverses of local connectivities of all edges in a graph is at most
N or N minus 1. Which means that the graph we are getting after the
sampling process is in fact sparse. But the trickier question is the
weight of every cut approximately preserved.
So as we saw, in expectation every cut is preserved, but do we get
concentration? And this is a question that was asked in the
original seminal paper of Benczúr and Karger when they introduced
sparsification, and it remained open for many years.
Of course, there was work on sparsification in these 15 years when
people showed that instead of using local connectivity, if you use
slightly more artificial parameters such as edge strengths or effective
conductances, you would in fact get sparsifiers.
But the original question stayed open until we showed recently that in
fact this is true. So the conjecture was true: if you sample every
edge by the inverse of its local connectivity, that works. And it's not just
for intellectual curiosity, but this actually implies the previous
theorems as well.
I should mention here that effective conductance sampling also gets
stronger properties, which we don't get. But for cut sparsification,
the theorem that we show actually implies both the previous theorems
and brings them under the same umbrella. The two previous results were
incomparable. Also, this leads to better sparsification algorithms.
In particular, we get the first linear time algorithm for cut
sparsification.
Recall that one of the uses of sparsification was to use it as a
preprocessor and then run other cut algorithms. Now, of course, if the
algorithm itself is not efficient, then you can't have this recipe of
using it as a preprocessor. That will become the bottleneck.
So it's important to get sparsification algorithms that are efficient,
and we get one that runs in exactly linear time.
>>: [inaudible].
>> Debmalya Panigrahi: It was linear and there were many logs after
that. Several logs.
>>: Do you essentially compute the numbers?
Or is that basically --
>> Debmalya Panigrahi: No, if you want to explicitly compute the
lambdas then we have to construct the Gomory-Hu tree, which would take
M times N time. So instead of that, we use some structural insight
from the proof to implicitly construct a different set of probabilities
that also gives us a sparsifier.
>>: So in the end are you sampling with lambda E, or is the algorithm
sampling with a different --
>> Debmalya Panigrahi: Sampling with a different set of probabilities.
Really what we want is we need to sample with probabilities such that
the probabilities sum up to something small. And we get this
concentration bound.
So really it's a one-sided bound that we want. We want the probability
to be at least 1 over lambda E. But as long as the sum is small we're
happy to have higher probabilities. So we exploit that. The
probabilities are at least 1 over lambda E, but our individual
probabilities might be higher than 1 over lambda E.
All right. So I want to end with a general overview of my research.
As I showed you, I have worked on several problems in graph
connectivity. I'm also interested in online and stochastic problems.
Problems where the input is uncertain. And I worked on some modeling
issues in online problems. I've also worked in specific optimization
problems in the online framework, for example, load balancing and
matching, these are resource allocation problems, and also on more
applied problems such as news feed selection in social networks and so
on.
This also overlaps with Web-based applications for which I have also
worked in some search algorithms. And also in some networking
algorithms, for example, for long distance Wi-Fi networks, adaptive
channel networks, monitoring, and peer-to-peer networks. So this is sort of
the general structure of what I work on.
Some of the graph connectivity and online stochastic problems are more
theoretical. Other parts are more applied. So
these are all algorithmic applications in specific systems.
All right. So I want to end with a couple of favorite problems. So
here is one. And these are very concrete problems. So I have been
telling you from the beginning that global min cuts are easy to
compute. In particular we have linear time algorithms. But I've been
cheating a little. This is true if you're happy with a Monte Carlo
algorithm, that is, if you don't want a certificate that the output is
correct. But if you want the certificate, the runtime becomes much
worse. So we really don't know how to efficiently certify min cuts in
a graph. And this is one problem I would like to work on. Another
important question that we don't know anything about is capacitated
network design. In reality network links -- yeah?
>>: [inaudible] that produced [inaudible].
>> Debmalya Panigrahi: But, yeah, well not explicitly perhaps but by
certificate I mean it certifies correctness.
>>: And nothing else [inaudible].
>> Debmalya Panigrahi: Well, but still there has to be a proof.
>>: The fact that it terminates, it's a certificate where you guys --
>>: The algorithm [inaudible].
>>: The algorithm.
>>: [inaudible] sunset like problematic, something that [inaudible].
>>: If you want zero error --
>> Debmalya Panigrahi: I mean, these Monte Carlo algorithms would have
a slight probability of error, and I mean one way of certifying it is to
say that if there were a smaller cut we would be able to find something else.
Still we're certifying it somehow.
Okay. The other problem I'm interested in is capacitated network
design. So in reality network links and nodes not only have costs but
they also have capacities. But in all classical network design
questions, such as Steiner trees, capacities are completely ignored.
Even for this very simple, apparently very simple question, I give you
a graph. I give you a pair of terminals. What is the cheapest
subgraph that can support a particular flow between these terminals?
And we don't know the answer. We don't know any sub polynomial
approximation for even this very simple-looking problem. More broadly,
I'm interested in exploring combinatorial structure of graphs to
develop better algorithms for fundamental problems. And one thing I
want to emphasize here is I think simplicity of an algorithm is also a
virtue that it should be possible to trade off against more
quantitative virtues like running time, quality of
approximation, competitive ratio and so on and so forth. So everything
I've shown, for example, are very simple algorithms. There's nothing
complicated going on at all.
I'm also interested in tech transfer between algorithmic theory and
applications, can we use this entire toolkit we've built over 30 or 40
years to solve various application oriented problems? And that's it.
Thank you.
[applause]