Floyd Warshall with Path Recovery
leading to multithreaded all pairs
shortest distance
Kevin Kauffman
CPS149s Fall 2009
Abstract
Introduction
The Floyd Warshall algorithm has been
around forever (or at least since Floyd and
Warshall got bored on a rainy afternoon a few
years back). The algorithm is a polynomial time
algorithm to compute the shortest distance
between all pairs of points in a set. So, basically,
if you gave it a map with a bunch of cities and
the distance between them, it would tell you
what the shortest distance between any two
was. This is superior to the more famous
Dijkstra’s algorithm because Dijkstra figured it
was ‘utter nonsense’ to calculate the distances
between all the points, because you could only
go from one city to the next anyway, you can’t
go to all of them. The hit on the Floyd Warshall
algorithm, though, is that it takes more time.
While Dijkstra is happily moving along at roughly
n log n per starting city (on a sparse map, using
a heap), Floyd and his buddy Warshall are hogging
up the slow vehicle lane running n^3. Too bad they
hadn’t invented HOV lanes, or the pair would
have been home free. So, anyway, once you run
your FW on your set of points, you have all
your distances. What do you do now? You know
that it’s 100 miles from New York to Philly, but
you have absolutely no idea how to get there.
The answer is Path recovery. I argue that with
minor modifications, you can run FW using
about twice as much memory, in the same
runtime complexity, and be able to recover the
actual route which the shortest path takes
between any two points. Once this is complete,
I present how the idea can be extended to
create a multithreaded distance algorithm
which is totally awesome.
Floyd Warshall
The FW, as it stands now, is a way of
calculating the distances between all pairs of
points in a set. Seeing as there are n^2 pairs, it
is quite amazing that you can calculate the
lengths of all those shortest paths using only
n^3 complexity. Here’s how it works. So you start
with a nice adjacency matrix with each node
having an infinite distance to every other node
(except its neighbors because it would be sad if
your neighbors lived infinitely far away, and
except itself, which is zero away). So you
set your neighbors’ distances. You are now
ready to go. You start with an arbitrary node,
and for every pair of nodes, you check whether
the shortest path between those two nodes is
shortened when you add in the arbitrary node
in the middle. So it works like this. If A and B are
10 apart, A and C are 5 apart and B and C are 2
apart, and your arbitrary node is C, you look and
say, is the path from A to B through C shorter
than my current path from A to B? If you are
smart you say ‘yes’ because 5+2=7<10. In this
way, you
continue picking arbitrary nodes (really you just
choose nodes in order… so it’s not really
arbitrary, but it kind of is because it doesn’t
matter one bit if you pick them in order or not
so long as they are all picked) and checking all
pairs to see if their distance is shortened when
using that extra node. The code will be in the
appendix by the time I finish this, but I’ll explain
it anyway. Basically what you have is a triple for
loop iterating over all nodes in the graph. The
outer loop represents the picking of arbitrary
nodes, and the inner double loop is the
mechanism by which you are able to look at all
pairs of points under that arbitrary node. Once
this is done you have this amazing recurrence
relation which does all the work for you forever.
path[j][k] = min(path[j][k], path[j][i] + path[i][k]);
Basically if j and k are the two endpoints of the
current path you’re looking at, and i is the
intermediate point, you say, which is shorter:
my current path, or the path if I go to i first and
then to k? You take the shorter one. Do this
n^3 times and you got yourself some shortest
distances. Once you’ve run through your n^3
comparisons, what’s left is your adjacency
matrix with the distances between all your
points. But still you’re left with this problem
that distances are great, but you don’t got any
directions. Imagine if Google Maps, when you
asked for directions from New York to Dallas,
just said ‘1000 miles.’ Pretty useless, huh?
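Here is a minimal sketch of the whole thing in Java, run on the A, B, C example from above. The class name, the helper names and the INF constant are made up for illustration; INF is half of Integer.MAX_VALUE on the assumption that all distances are non-negative, so that adding two "infinite" entries cannot overflow.

```java
import java.util.Arrays;

class FloydWarshallDemo {
    // Stand-in for "infinitely far away"; halved so INF + INF still fits in an int.
    static final int INF = Integer.MAX_VALUE / 2;

    // Starting matrix: every node is 0 from itself and INF from everything else.
    static int[][] emptyGraph(int n) {
        int[][] dist = new int[n][n];
        for (int[] row : dist) Arrays.fill(row, INF);
        for (int v = 0; v < n; v++) dist[v][v] = 0;
        return dist;
    }

    // Records a two-way road of the given length between neighbors a and b.
    static void connect(int[][] dist, int a, int b, int length) {
        dist[a][b] = length;
        dist[b][a] = length;
    }

    // The triple loop: i is the node in the middle, (j, k) is the pair being checked.
    static void floydWarshall(int[][] dist) {
        int n = dist.length;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < n; k++) {
                    dist[j][k] = Math.min(dist[j][k], dist[j][i] + dist[i][k]);
                }
            }
        }
    }

    public static void main(String[] args) {
        final int A = 0, B = 1, C = 2;
        int[][] dist = emptyGraph(3);
        connect(dist, A, B, 10);
        connect(dist, A, C, 5);
        connect(dist, B, C, 2);
        floydWarshall(dist);
        System.out.println("A to B: " + dist[A][B]); // prints "A to B: 7"
    }
}
```

The answer drops from 10 to 7 on the pass where C is the node in the middle, which is exactly the comparison described in the text.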
Network Routing
It would appear that this section has
nothing to do with FW, and you’re right. But it
gave me the inspiration for my path recovery
scheme. Basically with network routing, you
have yourself a bunch of these routers and
you’re trying to send data between all of them.
Now of course, in the naïve case, when you got
a packet at D trying to go to V, the packet
knows what turn it should take at every router.
For this to work, though, you would either need
to run Dijkstra’s with path recovery every time a
packet came through, or you would need to
have a map of every path from every node
to every other node, which undoubtedly takes
up a ton of space.
Thank goodness there are unnaive
people in the world to come up with good ideas
(like FW) so that we don’t have to endure the
travesty which would have come to pass had
each router in the world had to store a path to
every other router in the world. (Prof. Sorin
inserts “How did they cope with the storage
problem”) So, what we do instead is a sort of
dynamic routing. It’s dynamic in the sense that
the entire path isn’t known when the packet is
dispatched, but the path is figured out along the
way. What happens is that each router stores a
table of where to forward each packet based on
its destination. So if a packet is at D and wants
to go to V, the router looks up V in the table,
but instead of seeing the entire path, all it gets
is 1 node, the node which the packet should be
forwarded to. For completeness’ sake, we’ll say
it’s Q. It doesn’t really matter what letter in this
example, though, because it’s an example. But
once the packet gets to Q, it looks up V in its
table and sees that it should go to J next. This
continues till the packet ends up at V. (Prof.
Sorin inserts “Clever!” here) Now, of course,
you may or may not have realized how this
relates to FW, but that fact is eminently
irrelevant, as I’m about to tell you.
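The table lookup described above can be sketched in a few lines of Java. Everything named here (NextHopRouting, route, the map layout) is hypothetical, and the last hop from J straight to V is an assumption, since the example only says the packet eventually gets there.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NextHopRouting {
    // tables.get(router).get(destination) is the one neighbor that router forwards to.
    static List<String> route(Map<String, Map<String, String>> tables,
                              String source, String destination) {
        List<String> hops = new ArrayList<>();
        hops.add(source);
        String current = source;
        while (!current.equals(destination)) {
            Map<String, String> table = tables.get(current);
            String next = (table == null) ? null : table.get(destination);
            // Give up on a missing entry, or if we have hopped more times than there are tables.
            if (next == null || hops.size() > tables.size()) {
                throw new IllegalStateException("no route from " + current + " to " + destination);
            }
            hops.add(next);
            current = next;
        }
        return hops;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> tables = new HashMap<>();
        tables.put("D", Map.of("V", "Q")); // at D, packets for V go to Q
        tables.put("Q", Map.of("V", "J")); // at Q, packets for V go to J
        tables.put("J", Map.of("V", "V")); // assumed: J is a direct neighbor of V
        System.out.println(route(tables, "D", "V")); // prints [D, Q, J, V]
    }
}
```

No router ever holds the whole path. Each one only stores one entry per destination, which is the storage trick the rest of the paper borrows.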
FW with Path Recovery
So as I’ve said like 3 times already, FW
has a drawback in that you can’t know where
you’re going, only how long it will take. But
we’re about to change that. The basic premise
is this: throughout the calculation of the
shortest path distances, we maintain a second
matrix in which the entry at [j][k]
represents what node you must go to next if
you are at J going to K. In the case of network
routing, each router only cares about getting
from 1 node to all other nodes. But with FW, all
the calculations are together, so you put all the
tables together and get your NxN matrix with
the values. Thus to recover a path, you do the
same thing the packet in our network did. If you
are trying to get to node V, you look up the row
for the node you’re at, and then pan over to the V
column; that node is the next node in the chain.
Then you look up V from that node and obtain
the next node in the chain. You repeat this until
a node points directly to V, and then if you were
smart and were writing down the points as you
went, you have your path. With this method,
you can effectively store the path from each
node to every other node in only n^2 space,
which coupled with the adjacency matrix, only
doubles the total amount of space FW takes up
to 2n^2.
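As a rough sketch, the walk through the table looks like this in Java. The names are invented for illustration, nodes are plain ints, and -1 stands in for a null entry (that marker is an assumption of this sketch, not something FW requires). The sample table is hand-written for three nodes A=0, B=1, C=2 where the best route from A to B goes through C.

```java
import java.util.ArrayList;
import java.util.List;

class PathRecovery {
    static final int NO_PATH = -1; // hypothetical marker for a "null" table entry

    // next[j][k] is the node to visit right after j on the way to k.
    // Returns the nodes in order, or an empty list if there is no path
    // (or the table is malformed and keeps going in circles).
    static List<Integer> recoverPath(int[][] next, int from, int to) {
        List<Integer> path = new ArrayList<>();
        path.add(from);
        int current = from;
        while (current != to) {
            current = next[current][to];
            if (current == NO_PATH || path.size() >= next.length) {
                return new ArrayList<>();
            }
            path.add(current); // write down the points as you go
        }
        return path;
    }

    public static void main(String[] args) {
        int[][] next = {
            {0, 2, 2}, // from A: to reach B or C, go to C first
            {2, 1, 2}, // from B: to reach A or C, go to C first
            {0, 1, 2}, // from C: A and B are both direct
        };
        System.out.println(recoverPath(next, 0, 1)); // prints [0, 2, 1]
    }
}
```

Each step is one array lookup, so recovering a path costs time proportional to the number of nodes on it.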
Now, you might be like ‘okay, we’re
done,’ but you’d be very, very wrong. So we
have this great matrix, but how in the world do
we update the values without increasing the
complexity? TRIVIAL!!! The answer lies in
cleverness. So we sort of use the n^3
comparisons from before and make them do
some extra work. (sort of) So here’s how it
goes. We start with our adjacency matrix with
most of the values infinite and the few
neighboring values filled in. Here we can start
filling in the path table (as I’ll call it from now,
or P-tabby for short). For each pair of
neighbors, before we run the algorithm, we
know that the only path we have between them
so far is the direct one. So if A and B are
neighbors to start, in the adj. matrix we have
the distance between them, and at [a][b] in the
path table we have b, because the next node in
the path from a to b is b, and at [b][a] we have a
for the same reason. If this last sentence blew
your mind, please stop reading, because the
rest of the paper will blow much more. So we
fill in these trivialities and leave the entries for
which the path length is infinite as null. Now we
can actually start running the algorithm.
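A small sketch of that starting fill, under a couple of assumptions: INF is a made-up constant for "no edge yet", -1 plays the role of null, and a node's entry for itself is just itself.

```java
import java.util.Arrays;

class PathTableInit {
    static final int INF = Integer.MAX_VALUE / 2; // hypothetical "infinite" distance
    static final int NO_PATH = -1;                // hypothetical stand-in for null

    // p[j][k] = k when j already has a direct path to k, NO_PATH otherwise.
    static int[][] initPathTable(int[][] adj) {
        int n = adj.length;
        int[][] p = new int[n][n];
        for (int j = 0; j < n; j++) {
            for (int k = 0; k < n; k++) {
                p[j][k] = (adj[j][k] < INF) ? k : NO_PATH;
            }
        }
        return p;
    }

    public static void main(String[] args) {
        // A=0 and B=1 are neighbors (4 apart); C=2 is not connected to anything yet.
        int[][] adj = {
            {0,   4,   INF},
            {4,   0,   INF},
            {INF, INF, 0},
        };
        System.out.println(Arrays.deepToString(initPathTable(adj)));
        // prints [[0, 1, -1], [0, 1, -1], [-1, -1, 2]]
    }
}
```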
So with our recurrence relation at the
middle of the algorithm, there are two options:
a, the path we have is the shortest path still, or
b, there is a new shortest path through this
other node. The first case is trivial because
nothing changes in either the adjacency matrix
or in P-Tabby. The second case is more
interesting because things do change. The first
thing that changes is the distance in the
adjacency matrix is updated to be the new path.
Obtaining this value is trivial because you must
calculate it before the comparison anyway, and
it simply is looking up two values in the table.
The update in the path table is a little more
tricky. Therefore we leave this exercise up to
the reader (or the solution is outside the scope
of this paper, take your pick). Basically the value
stored in the P table can be the arbitrary node I,
it could stay the same, or it can change to some
other node. Figuring out which to do is at the
heart of the problem. So what you do is, since
you know that the new shortest path is the
combination of the path from J to I and from I
to K, you really just look at the value in the P
table at [j][i] and copy it into [j][k]. Since you
know you’re going to visit I, and your
shortest path is EXACTLY the path from J to I
followed by the path from I to K, the first node
on the path from J to K must then also be the
first node on the path from J to I. In this way
you update the path table without any extra
calculation, and since copying a value around
the matrix takes constant time, your overall
complexity is still n^3.
When writing the code, you have to be sure to
predicate correctly so as to avoid doubling your
runtime by adding a hidden second comparison
(perhaps in an if statement). Once again, the
code is in the appendix. I don’t know the figure
number yet because I haven’t made it yet. If
you can’t figure out that it’s probably the
second figure of code, then you do not deserve
the knowledge in this paper and you should
stop reading. Basically in the relation at the
middle of the code, you have to finagle a little
bit. Basically you eliminate the min statement
and put in an if then flow. If the distance
through I is less than the original distance THEN
change the distance to the distance through I
and replace the value in the P table. You don’t
do anything otherwise. For efficiency’s sake, it
makes sense to calculate the sum of the path
through I before the if statement to avoid
having to do the addition twice. And then after
you’ve run the code, you have your path table,
your distance table, and you know how to
recover the paths. I’d say you’re done. (Prof.
Sorin inserts “This is one of those ideas I wish
I’d thought of.”)
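To see why you copy the entry at [j][i] rather than writing I itself, here is a sketch on a made-up four-node graph: a chain A, C, D, B with every link 1 long, plus a slow direct road of 10 from A to B. The class name, INF and the -1 marker are assumptions of the sketch, and it assumes non-negative distances.

```java
class FloydWarshallWithPaths {
    static final int INF = Integer.MAX_VALUE / 2; // hypothetical "infinite" distance
    static final int NO_PATH = -1;                // hypothetical stand-in for null

    // Overwrites dist with shortest distances and returns the path table.
    static int[][] run(int[][] dist) {
        int n = dist.length;
        int[][] p = new int[n][n];
        for (int j = 0; j < n; j++) {
            for (int k = 0; k < n; k++) {
                p[j][k] = (dist[j][k] < INF) ? k : NO_PATH;
            }
        }
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < n; k++) {
                    int throughI = dist[j][i] + dist[i][k]; // added once, used twice
                    if (throughI < dist[j][k]) {
                        dist[j][k] = throughI;
                        p[j][k] = p[j][i]; // first step toward i, not i itself
                    }
                }
            }
        }
        return p;
    }

    public static void main(String[] args) {
        final int A = 0, B = 1, C = 2, D = 3;
        int[][] dist = {
            {0,   10,  1,   INF}, // A: slow road to B, short link to C
            {10,  0,   INF, 1},   // B: slow road to A, short link to D
            {1,   INF, 0,   1},   // C: links to A and D
            {INF, 1,   1,   0},   // D: links to B and C
        };
        int[][] p = run(dist);
        System.out.println("A to B: " + dist[A][B] + ", first stop: node " + p[A][B]);
        // prints "A to B: 3, first stop: node 2"
    }
}
```

The A-to-B distance improves on the pass where D is the node in the middle, but the value that lands in the table is C, because C is the first step on the way from A to D. That is the "some other node" case from the text.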
Multithreaded Paths
Unfortunately I’m not an expert on Java
thread libraries, so I’m not sure I could write the
concurrent program to do this (though I could in
C++), but this section is a discussion about how
to use the previous two sections on network
routing and path reconstruction to develop a
multithreaded implementation of an all pairs
shortest distance algorithm which could be optimized to
run on any number of cores. The heart of FW is
that distances can propagate out of the
arbitrary nodes to all other nodes as you move
through the for loops. In a multithreaded
scheme, instead of sequentially propagating the
values from each node to the next, you do them
all at the same time. So you have your
adjacency matrix to start, but once you split up
the threads, this matrix and the path table are
held individually by each thread, each
containing only information pertinent to that
particular thread. So thread a (representing
node a) only contains information about paths
starting at it instead of the entire table. The actual
adjacency matrix must still exist, though, as a
way of moving information between the
threads. It is also important that each thread
maintains information about its neighbors. So
here is basically how it works. You start with
your adjacency matrix, and then you create a
thread for each node. Each thread knows its
neighbors and the distance to them. It also
knows the node which starts the path to any
other node in the set. At start, it only has the
neighbors as nodes in the paths with the rest
being null, because it doesn’t have a path, and
the distances to the other nodes are infinite. At
each timestep, each thread forwards its
distance table to its neighbors. This forwarding
thought is slightly quaint and doesn’t
necessarily work in a coding atmosphere like it
does in a network solution. So what really
happens is that it takes its distance table and
copies the information into its location in the
adjacency matrix. It then pulls out of the
adjacency matrix the information of its
neighbors when it is needed. Then, it has a for
loop which iterates over each node in the
graph, I, then a for loop iterating over each of
the thread’s neighbors, n. Then we have a nice
relation which does the work.
distance[i] = min(distance[i], distance[n] + n.distance[i])
Basically what this does is it says if the distance
from one of my neighbors to a node plus my
distance to the neighbor is LESS than my
current distance to the node, update my
distance. In this case, you would also update
your thread’s path table to know that to get to
node I, you must now go through node n
instead of whatever it was previously. As this
work is completing, you are constantly putting
new information in the adjacency matrix and
pulling the most recent information out as it is
used. With this method running on every
thread, the correct distances will propagate
throughout the graph till the path and distances
are known for each node.
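Here is one way this could look in Java. It is a sketch under stated assumptions, not the scheme exactly as described. First, it runs in synchronized rounds: every node reads only the tables from the previous round and writes only its own row, so no locking is needed. Second, it uses the direct edge length to the neighbor rather than the current best distance to it, so that recording the neighbor as the next step is always right. Third, it uses a parallel stream instead of one hand-made thread per node. All names, INF and the -1 marker are made up for illustration.

```java
import java.util.stream.IntStream;

class ParallelAllPairs {
    static final int INF = Integer.MAX_VALUE / 2; // hypothetical "infinite" distance
    static final int NO_PATH = -1;                // hypothetical stand-in for null

    // adj[v][u] is the edge length (INF if none, 0 on the diagonal), non-negative.
    // Returns {dist, next}; adj itself is left untouched.
    static int[][][] solve(int[][] adj) {
        int n = adj.length;
        int[][] neighbors = new int[n][];
        int[][] next = new int[n][n];
        int[][] dist = new int[n][];
        for (int v = 0; v < n; v++) {
            final int node = v;
            neighbors[v] = IntStream.range(0, n)
                    .filter(u -> u != node && adj[node][u] < INF)
                    .toArray();
            dist[v] = adj[v].clone();
            for (int i = 0; i < n; i++) {
                next[v][i] = (adj[v][i] < INF) ? i : NO_PATH;
            }
        }
        boolean changed = true;
        while (changed) { // keep running rounds until every row stabilizes
            final int[][] old = dist;          // what everyone reads this round
            final int[][] fresh = new int[n][]; // what everyone writes this round
            changed = IntStream.range(0, n).parallel().mapToObj(v -> {
                int[] row = old[v].clone();
                boolean rowChanged = false;
                for (int i = 0; i < n; i++) {
                    for (int nb : neighbors[v]) {
                        int viaNeighbor = adj[v][nb] + old[nb][i];
                        if (viaNeighbor < row[i]) {
                            row[i] = viaNeighbor;
                            next[v][i] = nb; // to reach i, go through nb first
                            rowChanged = true;
                        }
                    }
                }
                fresh[v] = row;
                return rowChanged;
            }).reduce(false, Boolean::logicalOr);
            dist = fresh;
        }
        return new int[][][] {dist, next};
    }

    public static void main(String[] args) {
        // Chain A=0, C=2, D=3, B=1 with links of 1, plus a direct A-B road of 10.
        int[][] adj = {
            {0,   10,  1,   INF},
            {10,  0,   INF, 1},
            {1,   INF, 0,   1},
            {INF, 1,   1,   0},
        };
        int[][][] result = solve(adj);
        System.out.println("A to B: " + result[0][0][1] + ", first stop: node " + result[1][0][1]);
        // prints "A to B: 3, first stop: node 2"
    }
}
```

On this graph the tables settle after two rounds, and a third round confirms that nothing changes any more.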
This method is not only better because
it is optimized for parallel applications and
multicore processors, but it actually uses fewer
comparisons. With the single threaded FW, you
always compute n^3 comparisons. In the
multithreaded case, you have n threads which
each iterate over all n nodes and then, for each
node, over only that thread’s neighbors instead
of all n nodes. This looks great on paper: we
reduced the comparisons to some fraction of n^3.
But, there is a problem.
Since we are running them all at the same time,
there is no guarantee that information will all
propagate in an ideal manner like in FW.
Therefore we must run each thread multiple
times until the values for each thread
completely stabilize. Therefore this method will
run much slower on a single core machine, and
will likely take much longer in general unless you
have a large number of processors with lots of
cycles to waste. This is great in large systems
though because each thread only needs to be
aware of its neighbors, which in some instances
may be a win. If this was actually an academic
paper, I would implement both the single and
multithreaded schemes on a variety of cores
and compare the results for varying types of
graphs, but it’s 4:16 and this is due at 6, so it’s a
no go. Maybe in grad school.
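In place of measurements, here is a back-of-the-envelope count of the comparisons. It rests on assumptions that are not in the scheme as described: the graph is undirected with m edges and non-negative distances, the threads run in synchronized rounds (each round reads only the previous round's values), deg(v) is the number of neighbors of node v, and R is the number of rounds until nothing changes.

```latex
\begin{align*}
C_{\text{round}} &= \sum_{v} n \cdot \deg(v) = 2mn \\
C_{\text{multi}} &= R \cdot C_{\text{round}} = 2mnR \\
C_{\text{FW}}    &= n^3 \\
C_{\text{multi}} < C_{\text{FW}} &\iff 2mR < n^2
\end{align*}
```

Under those assumptions R is at most n-1: a shortest path has at most n-1 edges, the starting tables already cover the one-edge paths, every round extends the known paths by one edge, and one last round notices that nothing changed. So in the worst case the count is 2mn(n-1), which on a connected graph is no better than n^3. The win shows up when the graph is sparse and shortest paths take few hops. For a purely illustrative graph with n = 1000, m = 3000 and R = 10, that is 6 x 10^7 comparisons in total against 10^9 for FW, and those comparisons are also spread over the threads.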
Conclusions
Floyd Warshall with path reconstruction
is easy and a good thing to know in case it
shows up in a programming contest. It’s such a
great idea that perhaps someone should put it
on Wikipedia.
Acknowledgements
I’d like to thank Owen Astrachan for
inspiring this paper. I probably wouldn’t have
written it had he not assigned it. I’d also like to
thank Dan Sorin for providing the interjections
throughout the paper, which he didn’t actually
provide but undoubtedly would have had he
read this paper. Lastly I’d like to thank Romit
Choudhury for teaching me about networks. I’d
lastly like to thank Google Images for the
abstract.
References
[1] http://en.wikipedia.org/wiki/Floyd–Warshall_algorithm
[2] http://en.wikipedia.org/wiki/Dijkstra's_algorithm
Appendix
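// Stand-in for an infinite distance; use it for every pair that is not
// a pair of neighbors. It is half of Integer.MAX_VALUE so that adding
// two of them cannot overflow. Assumes non-negative distances.
static final int INF = Integer.MAX_VALUE / 2;

// Plain Floyd Warshall: overwrites adj with the all pairs distances.
public int[][] FW(int[][] adj){
    int size=adj.length;
    for(int i=0;i<size;i++){            // i: the node in the middle
        for(int j=0;j<size;j++){        // j: start of the pair
            for(int k=0;k<size;k++){    // k: end of the pair
                adj[j][k]=Math.min(adj[j][k],adj[j][i]+adj[i][k]);
            }
        }
    }
    return adj;
}

// Floyd Warshall with path recovery: overwrites adj with the distances
// and returns the path table p, where p[j][k] is the next node after j
// on the shortest path to k, or -1 (our "null") if there is no path.
public int[][] FWwPR(int[][] adj){
    int size=adj.length;
    int[][] p=new int[size][size];
    // Neighbors point straight at each other; everything else is null.
    for(int j=0;j<size;j++){
        for(int k=0;k<size;k++){
            p[j][k]=(adj[j][k]<INF)?k:-1;
        }
    }
    for(int i=0;i<size;i++){
        for(int j=0;j<size;j++){
            for(int k=0;k<size;k++){
                int temp=adj[j][i]+adj[i][k];   // distance through i, added once
                if(temp<adj[j][k]){
                    adj[j][k]=temp;
                    p[j][k]=p[j][i];            // first step toward i
                }
            }
        }
    }
    return p;
}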