Fast Lowest Common Ancestor Computations in Dags Stefan Eckhardt, Andreas Michael Mühling, and Johannes Nowak Fakultät für Informatik, Technische Universität München, Boltzmannstraße 3, D-85748 Garching, Germany {eckhardt,muehling,nowakj}@in.tum.de Abstract. This work studies lowest common ancestor computations in directed acyclic graphs. We present fast algorithms for solving the A LL -PAIRS R EPRE SENTATIVE LCA and A LL -PAIRS A LL LCA problems with expected running time of O(n2 log n) and O(n3 log log n) respectively, where the expectation is taken over a distribution of input graphs. The speed-ups over recently developed methods are achieved by applying transitive reduction on the input dags. The algorithms are experimentally evaluated against previous approaches demonstrating a significant improvement. On the purely theoretical side, we improve the upper bound for A LL -PAIRS A LL LCA to O(n3.3399 ). We give first fully dynamic algorithms for both A LL -PAIRS R EPRESENTATIVE LCA and A LL -PAIRS A LL LCA. Here, the non-trivial update complexities are O(n2.5 ) and O(n3 ) respectively, with constant query times. 1 Introduction Causality systems or other kinds of entity dependencies are naturally modeled by directed acyclic graphs (dags). Thinking of causal relations among a set of events, natural questions come up, such as: which event entails two given events? What is the last event which entails two given events? Transferring these questions to dags, answers are found by computing common ancestors (CAs), i.e., vertices that reach via some path each of the given vertices, and computing lowest common ancestors (LCAs), i.e., those common ancestors that do not reach any other common ancestor of the two given vertices. Although LCA algorithms for general dags are indispensable computational primitives, they have been found an independent subject of studies only recently [7,6,19]. There is a lot of sophisticated work devoted to LCA computations for the special case of trees (see, e.g., [15,22,6]), but due to the limited expressive power of trees they are often applicable only in restrictive or over-simplified settings. There are numerous applications for LCA queries in dags, e.g., object inheritance in programming languages, lattice operations for complex systems, lowest common ancestor queries in phylogenetic networks, or queries concerning customer-provider relationships in the Internet. For a more detailed description of possible applications, we refer to [6,5]. The algorithmic variants studied in this paper are: A LL -PAIRS R EPRESENTATIVE LCA : Compute one (representative) LCA for each pair of vertices, and A LL -PAIRS A LL LCA : Compute the set of all LCAs for each pair of vertices. Related Work. LCA algorithms have been extensively studied in the context of trees with most of the research rooted in [1,24]. The first asymptotically optimal algorithm L. Arge, M. Hoffmann, and E. Welzl (Eds.): ESA 2007, LNCS 4698, pp. 705–716, 2007. c Springer-Verlag Berlin Heidelberg 2007 706 S. Eckhardt, A.M. Mühling, and J. Nowak for the all-pairs LCA problem in trees, with linear preprocessing time and constant query time, was given in [15]. The same asymptotics was reached using a simpler and parallelizable algorithm in [22]. Recently, a reduction to range minimum queries has been used to obtain a further simplification[6]. More algorithmic variants can be found in, e.g., [8,26,25,9]. In the more general case of dags, a pair of nodes may have more than one LCA, which leads to the distinction of representative versus all LCA solutions. In early research both versions still coincide by considering dags with each pair having at most one LCA. Extending the work on LCAs in trees, in [20], an algorithm was described with linear preprocessing and constant query time for the LCA problem on arbitrarily directed trees (or, causal polytrees). Another solution was given in [2], where the representative problem in the context of object inheritance lattices was studied. The approach in [2], which is based on poset embeddings into boolean lattices yielded O(n3 ) preprocessing and O(log n) query time on lower semi-lattices. The representative LCA problem on general dags has been studied recently in [7,6,19,5]. The works rely on fast matrix multiplications (currently the fastest known ω+3 algorithm needs O(nω ) operations, with ω < 2.376 [11]) to achieve Õ(n 2 ) [6]1 and 1 Õ(n2+ 4−ω ) [19] preprocessing time on dags with n nodes and m edges. Recently, in [5] and independently [12], the upper bound of the representative LCA problem has been improved to O(n2+μ ) by applying rectangular matrix multiplication [10].2 For sparse dags, in [19,5], O(nm) algorithms have been presented as well. The authors of [5] study variants of the representative LCA problem, namely (L)CA computations in weighted dags and the A LL -PAIRS A LL LCA problem. For the latter problem, an upper bound of O(n3+μ ) is shown, as well as algorithms with scaling properties. In [6], a short experimental study of the performance of A LL -PAIRS R EPRESENTATIVE LCA algorithms is presented. It is demonstrated, that on random dags, a suboptimal but simple O(n3 ) algorithm based on range minimum queries (RMQ) and ancestor lists performs best. Results. We summarize the technical contributions of this paper. Most of the proofs are omitted due to space limitations and can be found in the full version of the paper [13]. 1. We present fast algorithms for solving the A LL -PAIRS R EPRESENTATIVE LCA and A LL -PAIRS A LL LCA problems with expected running time of O(n2 log n) and O(n3 log log n) respectively. Here, the expectation is taken over a distribution of all possible input dags, i.e., the algorithms are not randomized. We use the term expected running time in this sense throughout this work. The speed-up over recently developed methods is achieved by applying transitive reduction on the input dags. 2. We improve the recently shown upper bound for A LL -PAIRS A LL LCA from O(n3+μ ) to O(nω(2,1,1) ). This result is achieved by applying matrix multiplication to computing boolean witness matrices W (v) for each vertex v ∈ V , where W (v) [x, y] = 1 indicates that v is a CA, but not an LCA of (x, y). 1 2 Throughout this work, we use Õ(f (n)) for O (f (n) · polylog(n)). Throughout this work, ω(x, y, z) is the exponent of the algebraic matrix multiplication of a nx × ny with a ny × nz matrix. Let μ be such that ω(1, μ, 1) = 1 + 2μ is satisfied. The fastest known algorithms for rectangular matrix multiplication imply μ < 0.575 and ω(2, 1, 1) < 3.3399. Fast Lowest Common Ancestor Computations in Dags 707 3. We give first fully dynamic algorithms for both A LL -PAIRS R EPRESENTATIVE LCA and A LL -PAIRS A LL LCA . Here, the non-trivial update complexities are O(n2.5 ) and O(n3 ) respectively, with constant query times. The dynamic algorithms are based on a technique which reduces the matrix multiplications in the static solutions to transitive closure computations in corresponding dags. 4. We study LCA computations experimentally. We demonstrate that simplicity and the transitive reduction technique are fundamental to practically efficient algorithms. Furthermore, we evaluate the running time of several A LL -PAIRS R EP RESENTATIVE LCA and A LL -PAIRS A LL LCA algorithms in different settings. 2 Fast LCA Algorithms Let G = (V, E) be a directed graph. Throughout this work we denote by n the number of vertices and by m the number of edges. G is a directed acyclic graph (dag) if and only if G contains no cycles. Let Gclo = (V, Eclo ) denote the transitive closure of G, i.e., the graph having an edge (u, v) if v is reachable from u over some directed path in G. Similarly, let Gred = (V, Ered ) be the transitive reduction of G, i.e., the smallest graph G such that Gclo = Gclo . Throughout this work, let | Ered | = mred . A dag G = (V, E) imposes a partial ordering on the vertex set. Let N be a bijection from V into {1, . . . , n}. N is said to be a topological ordering if N (u) < N (v) whenever v is reachable from u in G. We refer to a vertex z which has the maximal topological number N (z) among all vertices in a set as the rightmost vertex. Let G = (V, E) be a dag and x, y, z ∈ V . The vertex z is a common ancestor (CA) of x and y if both x and y are reachable from z, i.e., (z, x) and (z, y) are in the transitive closure of G. By CA(x, y), we denote the set of all CAs of x and y. A vertex z is a lowest common ancestor (LCA) of x and y if and only if z ∈ CA(x, y) and for each z ∈ V with (z, z ) ∈ Eclo we have z ∈ CA(x, y). LCA(x, y) denotes the set of all LCAs of x and y. The algorithms we present in this section are based on the algorithms described in [5]. The following lemma is fundamental to the speed-ups that lead to practically fast algorithms. Lemma 1. For all x, y, z ∈ V , it holds that z = LCAG (x, y) if and only if z = LCAGred (x, y). The above lemma implies that LCA computations in a dag G can be restricted to the graph Gred . Observe, however, that this does not directly apply to LCA problems involving distances [5]. The restriction to the transitive reduction turns out to be of critical importance when designing practically fast algorithms for A LL -PAIRS R EPRESENTA TIVE LCA and A LL -PAIRS A LL LCA. The All-Pairs Representative LCA Problem. In [5], a dynamic programming algorithm which solves A LL -PAIRS R EPRESENTATIVE LCA in worst case time O(nm) is given. Applying Lemma 1 yields Algorithm APRLCA. Theorem 2. Algorithm APRLCA solves A LL -PAIRS R EPRESENTATIVE LCA in time O(n mred ). 708 S. Eckhardt, A.M. Mühling, and J. Nowak Although the modified algorithm offers no theoretical advantage in the worst case, its superior practical performance can be explained by considering the expected value of the parameter mred on random dags. In this section we consider random dags [4] in the Gn,p model, i.e., each possible edge in the graph is chosen independently with equal probability p. The following lemma is due to Simon [23]. Lemma 3. Let G = (V, E) be a random dag in the Gn,p model. Let mred be a random variable denoting the number of edges in the transitive reduction of G. Then E[mred ] = O(n log n) Corollary 4. Let G = (V, E) be a random dag in the Gn,p model. Then, the expected running time of Algorithm APRLCA on G is O(n2 log n). The All-Pairs All LCA Problem. We improve the algorithms for A LL -PAIRS A LL LCA presented in [5] by applying Lemma 1. Algorithm APA1 is quite natural. First, it computes the transitive closure of G in O(nm). Then, for every vertex z and every pair (x, y) it determines if z is an LCA of (x, y) in time O(out-deg(z)). If z is a CA of (x, y), but none of its children, then z is an LCA of (x, y). Checking whether a given vertex is a CA of a vertex pair can be done by simple transitive closure look-up. The total running time of the original algorithm is O(n2 m). Theorem 5. The time needed by Algorithm APA1 to solve A LL -PAIRS A LL LCA on G = (V, E) is bounded by O(n2 mred ). Let G = (V, E) be a random dag in the Gn,p model. Then, Algorithm APA1 solves A LL -PAIRS A LL LCA on G in expected time O(n3 log n). Applying Lemma 1 to Algorithm 2 in [5] yields Algorithm APA2. Like algorithm APRLCA it is based on dynamic programming. The LCA sets are iteratively constructed by merging already determined LCA sets. In the merging steps, all those vertices that are predecessors of some other vertices in the set are discarded. In [5], it is shown that the merging operation can be performed in time O(min{n, k 2 }), where k is the maximum cardinality of any LCA set. This finding allows for scaling with the maximal LCA sets. Moreover, it is possible to decide on-line which merging strategy is to be preferred. Observe, however, that even for sparse dags with m = O(n), the total size of the LCA sets can be Ω(n3 ) [5]. Theorem 6. The time needed by Algorithm APA2 to solve A LL -PAIRS A LL LCA on G = (V, E) is bounded by O(n mred min{n, k 2 }), where k is the maximum LCA set. Let G = (V, E) be a random dag in the Gn,p model. Then, Algorithm APA2 solves the A LL -PAIRS A LL LCA problem in expected time O(n2 log n min{n, k 2 }). Algorithm APA3 is achieved by applying Lemma 1 on Algorithm 4 in [5]. It is based on the following idea. suppose we are given a vertex z and want to determine all pairs (x, y) for which z is an LCA. To this end, we employ an A LL -PAIRS R EPRESENTATIVE LCA algorithm on G using a topological ordering that maximizes N (z). Since A LL -PAIRS R EPRESENTATIVE LCA algorithms return rightmost LCAs, this approach guarantees Fast Lowest Common Ancestor Computations in Dags 709 that z is returned for the appropriate pairs. This can even be improved by maximizing the topological numberings of vertices on a path simultaneously. For more details, we refer to [5]. Theorem 7. The time needed by Algorithm APA3 to solve A LL -PAIRS A LL LCA on G = (V, E) is bounded by O(n mred w), where w is the size of path cover. Let G = (V, E) be a random dag in the Gn,p model. Then, Algorithm APA3 solves A LL -PAIRS A LL LCA in expected time O(n3 ) for log2 n/n ≤ p < 1 and expected running time O(n3 log log n) for 0 < p < log2 n/n. Proof. The running time of algorithm APA3 is bounded by O(n mred w), where w is the width of the path cover of G. To analyze the expected running on a random dag, we use the following lemma [23]: Lemma 8. Let G = (V, E) be a random dag in the Gn,p model. Then, there exists an algorithm that computes a path cover of G of width w and the transitive reduction of G in time O(w mred ). Moreover O(n2 ) for log2 n/n ≤ p < 1 E[w mred ] = 2 O(n log log n) otherwise Algorithm B in [23] with running time O(n+m) yields a path cover of size w satisfying the conditions of the above lemma. Hence, the theorem follows. 3 Theoretical Improvements Improved Upper Bounds for A LL -PAIRS A LL LCA. We start by considering the following problem which we call O NE -V ERTEX A LL -PAIRS LCA . Given a dag G and a vertex v, find all pairs (x, y) such that v is an LCA of (x, y). Clearly, from a O NE V ERTEX A LL -PAIRS LCA solution for each v ∈ V , a solution to A LL -PAIRS A LL LCA can be derived in O(n3 ). O NE -V ERTEX A LL -PAIRS LCA reduces to the following. Find all pairs (x , y ) such that v is a CA of (x , y ). Then, test for each such pair, if there exists a witness, i.e., a successor of v that is also a CA of this pair, or not. To this end, it is enough to consider the children of v in G. A solution to this problem works as follows. We initialize an n × n matrix C (v) , such that C (v) [x, y] = 1 if and only if v is a CA of (x, y). Obviously, this can be done in time O(n2 ) with knowledge of the transitive closure. Let A(v) be an n × n matrix such that A(v) [x, z] = 1 if and only if x is reachable from z and z is a direct child of v, i.e., (v, z) ∈ E. Let A(v) [x, z] = 0 otherwise. This can be thought of as the transpose of the transitive closure restricted to the rows indexed with children of v, all other entries set to zero. Let A be the adjacency matrix corresponding to Gclo . Let W (v) = A(v) · A be the witness matrix of v. Lemma 9. Let W (v) be the witness matrix of vertex v. Then, v is an LCA of (x, y) if and only if W (v) [x, y] = 0 and C (v) [x, y] = 1. Hence, O NE -V ERTEX A LL -PAIRS LCA can be solved in time O(nω ). This approach leads immediately to an O(n1+ω ) algorithm for A LL -PAIRS A LL LCA . Fast rectangular matrix multiplication yields even a stronger upper bound. We can compute the 710 S. Eckhardt, A.M. Mühling, and J. Nowak witness matrices W (v) for vertices v ∈ V in one step by multiplying an n2 × n and an n × n matrix in time O(nω(2,1,1) ). ⎛ (v1 ) ⎞ ⎛ (v1 ) ⎞ A W ⎜ A(v2 ) ⎟ ⎜ W (v2 ) ⎟ ⎜ ⎟ ⎜ ⎟ (1) ⎜ .. ⎟ · A = ⎜ .. ⎟ ⎝ . ⎠ ⎝ . ⎠ A(vn ) W (vn ) Theorem 10. A LL -PAIRS A LL LCA can be solved in time O(nω(2,1,1) ). Proof. The algorithm works as follows: 1. Compute the transitive closure of G 2. Compute the common ancestor matrices C (v) for all v ∈ V . Since Gclo is known, this can be achieved in time O(n3 ). 3. Compute the witness matrices W (v) for all v ∈ V using Equation (1). This takes time O(nω(2,1,1) ). Currently, upper bounds for rectangular matrix multiplication imply ω(2, 1, 1) < 3.3399. 4. Let L(v) = C (v) − W (v) for each v ∈ V . Then, for each pair (x, y), all LCAs can be read from the entries L(v) [x, y] for each v. This step takes a total of O(n3 ) time. Dynamic Algorithms for A LL -PAIRS R EPRESENTATIVE LCA. We show how to solve the fully dynamic A LL -PAIRS R EPRESENTATIVE LCA problem with update time O(n2.5 ) and query time O(1). We consider vertex-centered updates, that is, given a vertex v, edges incident to v can be arbitrarily added or deleted in one update step. The approach is based on the matrix multiplication-based static A LL -PAIRS R EPRESENTA TIVE LCA solution in [5]. The static algorithm is based on rectangular matrix multiplications which serve as CA existence computations. To this end, the vertices are divided into l = n1−r sets V 1 , . . . , V l of size nr for some r ∈ [0, 1]. To ease exposition, we assume in the sequel that nr and n1−r are integers. For each vertex set V i one rectangular matrix product of an n × nr and an nr × n matrix is computed in order to identify all pairs for which a CA in the set V i exists. The rightmost LCAs are then searched in the vertex set with the largest index i. In order to improve the upper bound for vertexcentered updates, we reduce the rectangular matrix multiplications to transitive closure computations in dags G1 , . . . , Gl . The reduction has the following properties: 1. The adjacency matrices of the dags Giclo correspond to the results of the rectangular matrix products in the static solution. 2. Vertex-centered updates incur at most a constant amount of vertex-centered updates in each of the dags Gi . Each of the transitive closures of the dags Gi can be updated in time O(n2 ) using recent results [21]. Theorem 11. There exists an algorithm for dynamic A LL -PAIRS R EPRESENTATIVE LCA with O(n1−r+ω + n2+r ) initialization, O(n1−r+2 + n2+r ) update, and O(1) query time. Optimizing r for updates, we get O(n2.876 ) initialization, O(n2.5 ) update, and O(1) query time. Fast Lowest Common Ancestor Computations in Dags 711 Dynamic Algorithms for A LL -PAIRS A LL LCA. For A LL -PAIRS A LL LCA we achieve O(n1+ω ) initialization, O(n3 ) update and O(1) query time for fully dynamic vertex-centered updates. Here, the key is to speed up witness computations in the dynamic setting. Again, the dynamic speed-up is achieved by using transitive closure computations instead of matrix products for the witness matrix computations. The application of the reduction technique described above to A LL -PAIRS A LL LCA is straightforward. Theorem 12. There exists an algorithm for dynamic A LL -PAIRS R EPRESENTATIVE LCA with O(n3.376 ) initialization, O(n3 ) update, and O(1) query time. Observe that O(n3 ) update time is optimal in the worst case, since a single update can change up to Θ(n3 ) entries. 4 Experiments Implementation.3 We have implemented Algorithms APRLCA, APA1, APA2, and APA3 described in Section 2 in C++. For comparison with two of the algorithms that were subject to the experimental study in [6], we adapted the code provided by the authors to fit in our C++ framework. The third algorithm tested in [6] was ruled out due to its inferior performance. This finding is in accordance with the results of the study presented in [6]. RMQuery is based on combining ancestor lists for each of the vertices with LCA queries on a spanning tree of G. The running time of the algorithm is O(n3 ), but as a result of the experimental study in [6], it is efficient in practice. TCQuery is based on transitive closure queries. It computes Gclo first and then chooses the rightmost CA of a pair (x, y) by comparing the corresponding rows of the adjacency matrix of Gclo . The running time of the algorithm is Θ(n3 ). Additionally, we implemented the two transitive closure algorithms described in [23]: the algorithm of Goralćı́ková and Koubek[14](GK) with expected running time of O(n2 log n) and the algorithm of Simon [23] (Simon) with expected running time of O(n2 log log n). Our preprocessing routines are complemented by the greedy algorithm for computing a path cover of G with running time O(n + m)[23]. Both of the transitive closure algorithms naturally compute the transitive reduction and are used to accomplish both tasks. Approaches based on fast algebraic matrix multiplication were excluded from this study. These methods are widely believed to be not efficient in practice. The tests were done on a system with an Athlon XP 3000+ clocked at 2.154 GHz with 2 GB RAM and running on Linux. Test Data. We tested the algorithms on three families of dags: – Gn,p random dags: In order to mirror random dags of varying density we use the parameter p to control the expected number of edges in the dag. In all of our experiments we considered sparse (E[m] = Θ(n)), medium (E[m] = Θ(n log n)), and dense E[m] = Θ(n2 ) graphs. – Power law random dags: Modeling real world data often leads to graphs in which the degrees of the vertices satisfy the power law. That is, the number of vertices of 3 Our implementations and the used test data is available at http://wwwmayr.informatik. tu-muenchen.de/personen/nowakj/lca/ 712 S. Eckhardt, A.M. Mühling, and J. Nowak degree k is proportional to k −α for some constant α > 1. This is also true for dags, e.g., a citation graph compiled from crawling scientific literature obey a power law with α ≈ 1.7 [3]. There are numerous examples of large-scale networks which fall into the category of power law graphs. We generated random dags in which the out-degrees of the vertices have expected degrees following a power law slightly adapting the Chung-Law model [3]. First, target out-degrees of the vertices are generated according to the power law. Suppose that the vertices are sorted in decreasing order according to their target degrees. Then, for each vertex pair (i, j), i < j, an edge is added with probability degi /(n − i), where degi is the target degree of i. We generated dags for α = 2 and α = 1.5, representing common exponents. – Real-world data sets: Our real-world data sets include the Internet dag, the citation dag, and phylogenetic networks. The Internet dag is derived from business relationships of autonomous systems (ASes) in the Internet. Our data set is obtained from observable BGP routes. From these routes an acyclic AS graph is inferred with the method proposed in [18]. We use the same data set as in [16], where a detailed description of the data collection and postprocessing steps can be found. The final dag has 22,218 vertices and 57,413 edges. The citation dag is compiled from Citeseer’s OAI publicly available records4. Similar graphs have been studied and analyzed extensively in the past, see [3] and references therein. Computing LCAs in such dags may provide valuable information, e.g., which papers are original to two or more papers on a particular topic. The dag has 716,772 vertices and 1,331,948 edges. The data provided by CiteSeer is automatically obtained by crawling the web and not free of errors. We first parsed the documents and created a citation graph, where an edge (x, y) was added whenever y references x. We then manually cleaned the data by ordering the vertices according to their timestamps (which are part of the data) and deleting edges that are not consistent with the partial order imposed by the timestamps. We considered two phylogenetic networks for which the data sets are included as examples in the SplitsTree4 [17] package. The first network represents evolutionary relationships of a set of mammals. It has 498 vertices and 889 edges. The second network represents evolutionary relationships between a single protein, namely myosin, a protein involved in muscle contraction, in different organisms. The network has 5462 vertices and 10,321 edges. The Internet dag and the citation dag had to be sampled in order to be processed by our algorithms. We were interested in dense subgraphs in order to evaluate the effect of the transitive reduction in real-world dags. To achieve this, we sampled the graphs by performing breadth first searches starting at high-degree vertices. To sample a graph of size n, we choose the dag which is induced by the first n visited vertices. The sampling imposes a bias towards dense subgraphs. However, most of the instances generated are still sparse in the sense that the average degree of the vertices is a small constant, e.g., less than 5 in the case of the citation graph samples. Although the effect of the transitive reduction on sparse dags is small, our O(nm) dynamic programming approach is naturally by far the best choice for such dags. 4 http://citeseer.ist.psu.edu/oai.html Fast Lowest Common Ancestor Computations in Dags 713 Results. Due to space limitations, we cannot present all experimental data in this work. The full set of results is given in [13]. Transitive Reduction: We first compared the two methods for creating the transitive reduction of a dag. Simon clearly outperforms GK on medium and dense dags. Moreover, the space requirement for GK is significantly larger than for Simon imposing a limiting factor on large problem instances. Consequently, this method was used to create the transitive reductions in all further experiments. We also compared the effect of the transitive reduction in our real world data sets with comparable Gn,p dags, i.e., dags having the same number of vertices and same expected number of edges. Interestingly, the effect seems to be larger in the real world data sets. This may imply that—at least in sparse dags—the benefit of using transitive reduction as an algorithmic speed-up is even greater in non-random dags. All-Pairs Representative LCA: We compared the performance of Algorithm APRLCA with the two methods presented in [6]. Separately, we tested the result of using transitive reduction with the RMQuery algorithm. The effect is very small compared to APRLCA, but improves the running time slightly. Subsequently, transitive reduction 200 1000 RMQuery 180 140 700 120 600 100 TCQuery 800 Time [s] Time [s] RMQuery APRLCA 160 APRLCA 900 TCQuery 500 80 400 60 300 40 200 20 100 0 500 1000 1500 2000 2500 3000 Vertices 3500 4000 4500 0 5000 0 0.2 0.4 0.6 0.8 1 1.2 Vertices 1.4 1.6 1.8 2 4 x 10 (a) Performance of APR-algorithms on power law dags with (b) Performance of APR-algorithms on the Internet dag α=2 samples −4 4.5 x 10 −5 9 x 10 RMQuery 4 8 3.5 7 3 6 Time per vertex pair Time per vertex [s] APRLCA 2.5 2 1.5 4 3 2 0.5 1 1000 1500 2000 2500 3000 Vertices 3500 4000 4500 5000 (c) APRLCA as time per vertex on dense dags APRLCA 5 1 0 500 TCQuery 0 500 1000 1500 2000 2500 3000 Vertices 3500 4000 4500 5000 (d) TCQuery and RMQuery measured as time per vertex pair on dense dags Fig. 1. Evaluation of APR-algorithms 714 S. Eckhardt, A.M. Mühling, and J. Nowak Table 1. Experimental results on phylogenetic networks n m mred t(TR) avg#LCAs t(RMQuery) t(TCQuery) t(APRLCA) t(APA1) t(APA2) t(APA3*) t(APA3) Mammals 498 889 847 0.01 1.52574 0.88 1.17 0.02 1.38 1.34 8.7 1.741 Myosin 5462 10321 9997 1.09 1.14211 1159.96 545.623 4.56 1142.27 589.466 1976.1 1540.261 is used for APRLCA and RMQuery, whereas, TCQuery clearly does not benefit from transitive reduction. A sample of the results can be seen in Figure 1. The dynamic programming algorithm with transitive reduction proves to be the algorithm of choice. The benefits of using the transitive reduction increase significantly at higher densities. On the other hand, APRLCA with running time O(nm) is naturally the best choice on sparse dags clearly outperforming RMQuery and TCQuery. Experimental results on all families of dags support this finding. The full set of experimental data can be found in [13]. We measured the running times of APRLCA, RMQuery, and TCQuery, as time per vertex, i.e., dividing the running times by n, and again RMQuery and TCQuery as time per vertex pair, i.e., dividing the times by n(n − 1)/2. As a result, we conclude that the asymptotic improvement of APRLCA over the two other methods is roughly a factor of n. In particular, Figure 1(d) indicates that both algorithms have a cubic running time. 1500 6000 APALCA 1 APALCA 1 APALCA 2 APALCA 2 5000 APALCA 3* APALCA 3 APALCA 3 1000 Time [s] Time [s] 4000 500 3000 2000 1000 0 500 1000 1500 2000 2500 3000 Vertices 3500 4000 4500 0 5000 (a) Performance of APA-algorithms on dense Gn,p dags 0.8 1000 1500 2000 2500 Vertices 3000 3500 4000 4500 0.7 APALCA 1 APALCA 1 APALCA 2 0.6 APALCA 3* APALCA 2 APALCA 3* APALCA 3 APALCA 3 0.5 Time per vertex [s] 0.7 Time per vertex [s] 500 (b) Performance of APA-algorithms on the Internet dag samples 1 0.9 0 0.6 0.5 0.4 0.3 0.4 0.3 0.2 0.2 0.1 0.1 0 500 1000 1500 Vertices 2000 2500 0 500 1000 1500 2000 Vertices (c) APA-algorithms on medium Gn,p dags as time per ver- (d) APA-algorithms on power law dags (α = 1.5) as time tex per vertex Fig. 2. Evaluation of APA-algorithms Fast Lowest Common Ancestor Computations in Dags 715 The effect of the transitive reduction on the very sparse phylogenetic networks is very small, see Table 1. For A LL -PAIRS R EPRESENTATIVE LCA , the number of edges should be roughly halved by the transitive reduction in order to justify the additional expenses for computing the transitive reduction. All-Pairs All LCA: In addition to APA1, APA2, and APA3, we considered the performance of algorithm APA3*, which is essentially APA3 without using a path cover. The comparison of APA3 and APA3* enables direct evaluation of the effect of using a path cover. The large output complexity of the problem limits the tests to considerably smaller graphs, i.e., n ≤ 5000. Moreover, on hard problem instances, i.e., instances having large LCA sets, the restriction on the number of vertices is even larger. A sample of the results is given in Figure 2. APA2 is the best choice on Gn,p dags, whereas APA1 leads on power law dags and the real world data sets. Although APA3 has the best guaranteed expected running time, the overhead caused by the more complicated data structures does not pay off for the considered problem sizes. The good performance of APA2 can be explained by the fact that the LCA sets are usually small. The average number of LCAs for a pair is less then 2 in most of our experiments, especially on large, dense dags. Our results suggest that in most cases we can bound the number of LCAs for each pair by a constant. This implies an O(n2 log log n) running time of APA2. This is also supported by plotting the running times of the APA-algorithms per vertex. The effects of using a path cover to maximize the topological number of several vertices in one step over maximizing only a single vertex can be estimated by comparing APA3 and APA3*. While on sparse graphs, the additional cost of computing and maintaining the cover does not pay off, the effects on medium and dense dags are considerable. Interestingly, medium sized dags turn out to be the most difficult problem instances both in terms of running time and output size. We evaluated the the maximal and average size of the LCA sets in dependence of the dag densities. We conjecture that the values peak out in Gn,p dags for p ≈ polylog(n)/n which is supported by our experimental findings. In power law dags, the values roughly peak out at α = 1.5. This may explain the relatively inferior performance of APA2 on these dags. References 1. Aho, A.V., Hopcroft, J.E., Ullman, J.D.: On finding lowest common ancestors in trees. SIAM J. Comput. 5(1), 115–132 (1976) 2. Aı̈t-Kaci, H., Boyer, R.S., Lincoln, P., Nasr, R.: Efficient implementation of lattice operations. ACM Trans. Prog. Lang. Syst. 11(1), 115–146 (1989) 3. An, Y., Janssen, J., Milios, E.E.: Characterizing and mining the citation graph of the computer science literature. Knowl. Inf. Syst. 6(6), 664–678 (2004) 4. Barak, A.B., Erdős, P.: On the maximal number of strongly independent vertices in a random directed acyclic graph. SIAM J. on Algebraic and Discrete Methods 5(4), 508–514 (1984) 5. Baumgart, M., Eckhardt, S., Griebsch, J., Kosub, S., Nowak, J.: All-pairs common-ancestor problems in weighted directed acyclic graphs. In: Proc. ESCAPE’07. LNCS, vol. 4614, pp. 282–293. Springer, Heidelberg (2007) 6. Bender, M.A., Farach-Colton, M., Pemmasani, G., Skiena, S., Sumazin, P.: Lowest common ancestors in trees and directed acyclic graphs. J. Algorithms 57(2), 75–94 (2005) 716 S. Eckhardt, A.M. Mühling, and J. Nowak 7. Bender, M.A., Pemmasani, G., Skiena, S., Sumazin, P.: Finding least common ancestors in directed acyclic graphs. In: Proc. SODA’01, pp. 845–854 (2001) 8. Berkman, O., Vishkin, U.: Finding level-ancestors in trees. J. Comput. Syst. Sci. 48(2), 214– 230 (1994) 9. Cole, R., Hariharan, R.: Dynamic LCA queries on trees. SIAM J. Comput. 34(4), 894–923 (2005) 10. Coppersmith, D.: Rectangular matrix multiplication revisited. J. Complexity 13(1), 42–49 (1997) 11. Coppersmith, D., Winograd, S.: Matrix multiplication via arithmetic progressions. Journal of Symbolic Computation 9(3), 251–280 (1990) 12. Czumaj, A., Kowaluk, M., Lingas, A.: Faster algorithms for finding lowest common ancestors in directed acyclic graphs. ECCC (111) (2006) 13. Eckhardt, S., Mühling, A.M., Nowak, J.: Fast lca computations in directed acyclic graphs. Technical Report TUM-I0707, Inst. f. Informatik, TU München (2006) 14. Goralćı́ková, A., Koubek, V.: A reduct-and-closure algorithm for graphs. In: Becvar, J. (ed.) Mathematical Foundations of Computer Science 1979. LNCS, vol. 74, pp. 301–307. Springer, Heidelberg (1979) 15. Harel, D., Tarjan, R.E.: Fast algorithms for finding nearest common ancestors. SIAM J. Comput. 13(2), 338–355 (1984) 16. Hummel, B., Kosub, S.: Acyclic type-of-relationship problems on the internet: An experimental analysis. Technical report, Inst. f. Informatik, TU München (2007) 17. Huson, D.H., Bryant, D.: Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23(2), 254–267 (2006) 18. Kosub, S., Maaß, M.G., Täubig, H.: Acyclic type-of-relationship problems on the internet. In: Erlebach, T. (ed.) CAAN 2006. LNCS, vol. 4235, pp. 98–111. Springer, Heidelberg (2006) 19. Kowaluk, M., Lingas, A.: LCA queries in directed acyclic graphs. In: Caires, L., Italiano, G.F., Monteiro, L., Palamidessi, C., Yung, M. (eds.) ICALP 2005. LNCS, vol. 3580, pp. 241–248. Springer, Heidelberg (2005) 20. Nykänen, M., Ukkonen, E.: Finding lowest common ancestors in arbitrarily directed trees. Inf. Process. Lett. 50(1), 307–310 (1994) 21. Sankowski, P.: Dynamic transitive closure via dynamic matrix inverse (extended abstract). In: Proc. FOCS’04, pp. 509–517 (2004) 22. Schieber, B., Vishkin, U.: On finding lowest common ancestors: Simplification and parallelization. SIAM J. Comput. 17(6), 1253–1262 (1988) 23. Simon, K.: An improved algorithm for transitive closure on acyclic digraphs. Theor. Comput. Sci. 58, 325–346 (1988) 24. Tarjan, R.E.: Applications of path compression on balanced trees. J. ACM 26(4), 690–715 (1979) 25. Wang, B.-F., Tsai, J.-N., Chuang, Y.-C.: The lowest common ancestor problem on a tree with an unfixed root. Information Sciences 119(1–2), 125–130 (1999) 26. Wen, Z.: New algorithms for the LCA problem and the binary tree reconstruction problem. Inf. Process. Lett. 51(1), 11–16 (1994)