© 2012 Prof. Dr. Franz J. Brandenburg

Ranking Problems with Incomplete Information
Fixed Parameter Tractability of Distance Problems
Franz J. Brandenburg
University of Passau, Germany

Survey
  • the problem
  • the motivation
  • the solution
  • new open problems

Similarity of Permutations
  Definition: Given two total orders or permutations s and t on {1,2,...,n}.
  How can we measure their dissimilarity?
  1) count mismatches --> Kendall-tau
  2) count moves --> Spearman footrule
  3) others (Hamming, count exchanges of x's and y's)
  [Figure: two examples, "rotate the middle" and "swap the extremes"]

Kendall tau Distance
  Definition: Given two total orders or permutations s and t on {1,2,...,n}.
  The Kendall-tau (or Kemeny) distance K( ) is
    K(s, t) = #{(x, y) | x < y and (s(x)–s(y))•(t(x)–t(y)) < 0}
            = # disagreements: x <s y and y <t x
            = # the "dirty" pairs
            = # inversions (adjacent swaps) to transform s into t
            = bubbleSort distance
  [Figure: rotate the middle, K( ) = 2; swap the extremes, K( ) = 6]

Kendall-tau
  Kendall-tau distance
    named after Maurice Kendall in 1938 (statistics)
    invented by Gustav Fechner in 1897
  Kendall-tau distance = two-layer crossing problem
  There is an O(n log n) algorithm to compute
    - the number of crossings of n lines
    - the Kendall-tau distance of two total orders
  Open problem: Is there an O(n) algorithm?
  Compute the inversion numbers (D.E. Knuth 1968)
    inv(i) = #{ j | j > i and j left of i}
  Updates in O(log n) in a search tree.

Spearman-Footrule distance
  Spearman-footrule distance or Spearman's rho
    named after C. Spearman (1904) (correlation ranking in statistics)
  compute displacements of two permutations
    move an element by k units
    the L1 vector norm
    F(s, t) = ∑i |s(i)–t(i)|
  [Figure: two examples, value 8 and value 2]
  Lemma
    For total orders the Spearman footrule distance can be computed in O(n).
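The two distances can be sketched in a few lines of Python. This is a minimal sketch, not the author's implementation: it assumes that a ranking is given as a list of items from first to last, and the names `kendall_tau` and `footrule` are chosen here for illustration. The Kendall-tau distance is computed in O(n log n) by counting inversions during a merge sort, the footrule distance in O(n) by summing displacements.

```python
def kendall_tau(s, t):
    """Number of item pairs that the rankings s and t order differently."""
    pos_t = {item: i for i, item in enumerate(t)}
    seq = [pos_t[item] for item in s]  # s expressed in positions of t

    def sort_and_count(a):
        # Merge sort that also returns the number of inversions in a.
        if len(a) <= 1:
            return a, 0
        mid = len(a) // 2
        left, inv_left = sort_and_count(a[:mid])
        right, inv_right = sort_and_count(a[mid:])
        merged, inv = [], inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] precedes every remaining item of left
                inv += len(left) - i
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inv

    return sort_and_count(seq)[1]


def footrule(s, t):
    """Sum of the displacements |position in s - position in t| over all items."""
    pos_s = {item: i for i, item in enumerate(s)}
    pos_t = {item: i for i, item in enumerate(t)}
    return sum(abs(pos_s[item] - pos_t[item]) for item in pos_s)


identity = [1, 2, 3, 4, 5, 6, 7, 8]
breadth_first = [8, 1, 2, 3, 7, 4, 5, 6]  # an ordering used later in the slides
k = kendall_tau(breadth_first, identity)
f = footrule(breadth_first, identity)
print(k, f)  # 10 18
```

The values 10 and 18 agree with the Kendall-tau and Spearman values that the slides report for this ordering.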
Diaconis-Graham
  Theorem
    The Diaconis-Graham inequality (1977)
      K(s, t) ≤ F(s, t) ≤ 2•K(s, t)
  each crossing/mismatch/swap induces a displacement
  each displacement is repaired by two crossings.

Incomplete Information
  total order
    x < y or y < x for every pair of candidates x and y
  ties
    x ~ y: x and y are equivalent (an equivalence relation)
    "I don't care" for x and y
    bucket orders with equivalent items in a bucket and a total order for the buckets
  partial order
    x ? y: x and y are unrelated, "apples and oranges"
    ? is not transitive
    interval orders, hierarchical orders (trees)
  contradictory relations with cycles (from Lullus, 1299)

Generalization
  Given: a set of candidates X = {x1,...,xn} or simply {1,...,n}
         a partial order π on X; X is partially ordered by >
  say x > y if x has a higher ranking,
                if there is a preference for x
  properties:
    transitive: x > y and y > z ⇒ x > z
    partial: many pairs are unrelated
  Example: The Australian Open 2012
    the partial order implies: Djokovic beats Federer
    Murray and Nadal / Federer are incomparable
  [Figure: DAG on Djokovic, Nadal, Murray, Federer]

Partial Orders
  Given: a set of candidates X and a partial order π
  Representation of π: a DAG (directed acyclic graph) with transitive edges
    vertices X = {1,...,n}
    directed edges x ---> y if x > y in π
    the DAG displays only the generating edges, the transitive reduction
  [Figure: DAG on the vertices 1,...,8]
  e.g. 1 and 8 are unrelated, 8 > 2, 8 > 3, 8 > 5, and 2 and 3 are unrelated

Extensions
  How shall we compare two partial orders?
  ... via their sets of extensions
    Ext(π) = {total orders t | t does not disagree with π}
             π(i) < π(j) ⇒ t(i) < t(j)
    Ext(π) = {any order obtained from the DAG of π by topological sorting}
  Example:
    Ext(π = Ø) = all permutations
    Ext(π) = {all shuffles from left (8,2,3,5), (8,3,2,5)
              and right (1,7,4,6), (1,4,7,6), (1,4,6,7)}
  [Figure: DAG on the vertices 1,...,8]
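The set Ext(π) of the example can be enumerated by backtracking over all choices of a source. This is a minimal sketch under an assumption: the edge list `EDGES` is not printed on the slides; it is inferred here from the listed shuffles (8 before 2 and 3, both before 5; 1 before 4 and 7, 4 before 6). The function name `extensions` is also chosen for illustration.

```python
def extensions(vertices, edges):
    """Yield every topological order of the DAG as a tuple."""
    vertices = list(vertices)
    succ = {v: [] for v in vertices}
    indeg = {v: 0 for v in vertices}
    for x, y in edges:  # edge x -> y: x must come before y
        succ[x].append(y)
        indeg[y] += 1
    order, placed = [], set()

    def extend():
        if len(order) == len(vertices):
            yield tuple(order)
            return
        for v in vertices:
            if v not in placed and indeg[v] == 0:  # v is a current source
                order.append(v)
                placed.add(v)
                for w in succ[v]:
                    indeg[w] -= 1
                yield from extend()
                for w in succ[v]:  # undo, then try the next source
                    indeg[w] += 1
                placed.remove(v)
                order.pop()

    yield from extend()


# Assumed transitive reduction of the example partial order (inferred, see above).
EDGES = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 4), (4, 6), (1, 7)]
all_extensions = list(extensions(range(1, 9), EDGES))
print(len(all_extensions))  # 420
```

The count 420 matches the shuffle description: 2 orders of the left part, 3 orders of the right part, and C(8,4) = 70 ways to interleave them. The enumeration is exponential in general, so it only serves small examples.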
topSort
  An extension of π is a topological sorting
  topsort:
    do {
      get any source x (no incoming edges);
      print x;
      delete x;
    } while (there are vertices)
  Ext(π) = the set of all topsort runs; all possibilities for "any"
  Theorem (Brightwell, Winkler, ACM STOC 1991)
    Computing |Ext(P)| is #P-complete.
  Example runs on the DAG with the vertices 1,...,8:
    – breadth first:     8, 1, 2, 3, 7, 4, 5, 6
    – heap (min-heap):   1, 4, 6, 7, 8, 2, 3, 5
    – best:              1, 8, 2, 3, 4, 5, 6, 7

Distance Measures
  Given: a partial order π and a total (partial) order t
  What is their distance?
  nearest neighbor distance
    KNN(π, t) = min {K(s, t) | s is an extension of π}
    Interpretation: the positive view,
      there is some extension of π at distance ≤ k
  Hausdorff (farthest neighbor) distance
    KFN(π, t) = max {K(s, t) | s is an extension of π}
    Interpretation: the negative view,
      all extensions are within distance ≤ k
  Example:
    breadth first      8, 1, 2, 3, 7, 4, 5, 6    K( , id) = 10
    heap (min-heap)    1, 4, 6, 7, 8, 2, 3, 5    K( , id) = 11
    best               1, 8, 2, 3, 4, 5, 6, 7    K( , id) = 6

Distance Measures
  nearest neighbors = closest red-blue pair, or min min
  farthest neighbors = farthest red-blue pair, or max max
  Hausdorff distance = max {min distance{red, blue}}
  center distance = min {max distance{red, blue}}

Measures
  Hausdorff distance (Felix Hausdorff 1868-1942) is a metric.
  nearest neighbor, farthest neighbor, center distance are not!
    since d(X,Y) = 0 does not imply X = Y
    and there is no triangle inequality
  In R^2, for points p = (x,y), all four distances are in Θ(n log n).

for Partial Orders
  Extensions of the example DAG compared with id = 1 2 3 4 5 6 7 8:
    extension                            Kendall-tau   Spearman
    8 1 2 3 7 4 5 6   (breadth first)        10           18
    1 4 6 7 8 2 3 5   (min-heap)             11           22
    1 8 2 3 4 5 6 7   (opt)                   6           12
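The nearest and farthest neighbor distances of the running example can be checked by brute force. This is a minimal sketch for small instances only: it filters all 8! permutations instead of using a topological sort, and the edge list `EDGES` is an assumption inferred from the shuffles listed on the Extensions slide. The helper names are chosen for illustration.

```python
from itertools import combinations, permutations

# Assumed transitive reduction of the example partial order (inferred).
EDGES = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 4), (4, 6), (1, 7)]
IDENTITY = tuple(range(1, 9))


def is_extension(s, edges):
    """True if the total order s respects every edge x -> y (x before y)."""
    pos = {item: i for i, item in enumerate(s)}
    return all(pos[x] < pos[y] for x, y in edges)


def kendall_tau(s, t):
    """Quadratic pair count; fine for eight elements."""
    ps = {item: i for i, item in enumerate(s)}
    pt = {item: i for i, item in enumerate(t)}
    return sum(1 for x, y in combinations(s, 2)
               if (ps[x] - ps[y]) * (pt[x] - pt[y]) < 0)


def footrule(s, t):
    ps = {item: i for i, item in enumerate(s)}
    pt = {item: i for i, item in enumerate(t)}
    return sum(abs(ps[x] - pt[x]) for x in ps)


exts = [s for s in permutations(IDENTITY) if is_extension(s, EDGES)]
k_nn = min(kendall_tau(s, IDENTITY) for s in exts)  # nearest neighbor
k_fn = max(kendall_tau(s, IDENTITY) for s in exts)  # farthest neighbor
print(k_nn)  # 6

for s in [(8, 1, 2, 3, 7, 4, 5, 6), (1, 4, 6, 7, 8, 2, 3, 5), (1, 8, 2, 3, 4, 5, 6, 7)]:
    print(s, kendall_tau(s, IDENTITY), footrule(s, IDENTITY))
```

The loop reproduces the values of the three rows (10 and 18, 11 and 22, 6 and 12), and the minimum over all extensions confirms that the row marked (opt) is a nearest neighbor.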
Applications
  • ranking problems
    – in sport: who is the champion, ranking
    – in metasearch: aggregate data from several search engines, top-k lists
    – voting systems

Sport
  • Who was the best Formula 1 driver 2011?
      Sebastian Vettel, Ger    392 pts
      Jenson Button, Eng       270 pts
      Mark Webber, Aus         258 pts
    Schema: weighted Borda scores, sum points (25,18,...,1)
  • Who is the best tennis player? The best possible ranking?
    – the winner of the Australian Open?
      but Djokovic did not play Federer?
    – the aggregate winner of the four Grand Slams?
      evaluate the data from four trees: incomplete data
    The ranking list is by weighted Borda scores
  • An alternative: US sports
    • phase 1: scores
    • phase 2: finals

Meta Search
  • meta search machines (incl. google)
    aggregate the rankings (results) of many single searchers
      searcher_1 = (2,4,3,5,1,...)
      searcher_2 = (4,1,3,5,2,...)
      searcher_3 = (5,3,2,1,4,...)
    e.g. for hotels, flights, rental cars,...
    but in practice many providers do give you the best offer. They make the $$$
  How do they aggregate?
  How do they compute the ranking? the top-k list? the first page?

Rank Aggregation
  The rank aggregation problem:
  Given: a collection of total orders / permutations over n elements (t1, t2,...,tm)
  Problem: the best compromise (Kemeny score)
    a permutation or a linear order t* such that
      distance(t*, t1, t2,...,tm) = ∑i distance(t*, ti) --> MIN
    treat every voter as fair as possible (Biedl, B., Deng, Disc. Math. 2009)
      maxi {distance(t*, ti)} --> MIN
  As a decision problem:
    Given k: Is there a t* with distance(t*, t1, t2,...,tm) ≤ k ?
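Both objectives, the sum and the max version, can be tried on a toy instance by exhaustive search over all permutations. This is a sketch under stated assumptions: the three votes are the visible five-element prefixes of the searcher lists, treated here as complete rankings from first to last; the function names `kendall_tau` and `aggregate` are chosen for illustration. The search takes n! steps and is only meant for a handful of candidates.

```python
from itertools import combinations, permutations


def kendall_tau(s, t):
    """Number of item pairs that the rankings s and t order differently."""
    ps = {item: i for i, item in enumerate(s)}
    pt = {item: i for i, item in enumerate(t)}
    return sum(1 for x, y in combinations(s, 2)
               if (ps[x] - ps[y]) * (pt[x] - pt[y]) < 0)


def aggregate(votes, combine=sum):
    """Return a ranking minimizing combine(distances to all votes), and its score.

    combine=sum gives the Kemeny score, combine=max the max version.
    """
    def score(ranking):
        return combine(kendall_tau(ranking, vote) for vote in votes)

    best = min(permutations(votes[0]), key=score)
    return best, score(best)


# Hypothetical toy data: five-element prefixes taken as full rankings.
votes = [(2, 4, 3, 5, 1), (4, 1, 3, 5, 2), (5, 3, 2, 1, 4)]
best_sum, score_sum = aggregate(votes)
best_max, score_max = aggregate(votes, combine=max)
print(best_sum, score_sum)
print(best_max, score_max)
```

Passing the combining function as a parameter keeps the two problem versions in one routine; ties between optimal rankings are broken by the enumeration order of `permutations`.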
Facts
  Theorem: The rank aggregation problem under the Kendall-tau distance
  (1) is NP-hard
      - for many voters (Bartholdi, Tovey, Trick, 1989)
      - for even numbers > 4 (Dwork et al, WWW 2001)
        by a complex reduction from feedback arc set
        (small corrections by Biedl, Brandenburg, Deng, Disc. Math. 2009)
      - in the max version (Biedl et al 2009)
  (2) is in O(n) for two voters (you and your boy/girlfriend/husband...)
      ... take any of the two or any in between (crossing)

Facts (2)
  (3) Kemeny score (rank aggregation with Kendall tau)
      is fixed parameter tractable for several parameters
        Kemeny score
        number of candidates
        max and average range of candidate positions
      (Betzler, Fellows, Guo, Niedermeier, Rosamond, TCS 410(45) 2009)
  In contrast:
  (4) rank aggregation under the Spearman footrule distance
      is in O(n^3) for any number of voters (and even with local weights)
      (Dwork et al, 2001)
      by weighted matching:
        for i,j = 1,...,n set w_{i,j} = What does it cost to place i at j?
        ... and matching does the rest.

Results on Distances
  Given: a partial order π and a total order id = (1,...,n)
  nearest neighbor distance K(π, id)
    a total order s in Ext(π) such that K(s, id) --> MIN
  Theorem (Brandenburg, Gleißner, Hofmeier, WALCOM 2012 / J. Comb. Opt., to appear)
    The nearest neighbor distance problem is NP-hard,
      Kendall tau: by reduction from one-sided crossing minimization
                   in the version OSCM-4-stars
      Spearman: by reduction from clique
  [Figure: two-layer drawing, labels "fixed" and "mobile"]
  Idea: the lower level is π with 4 elements per point,
        no relations between points,
        and extra blockers and about n^2 crossings

NP-hard: What's next
  Approximation
    if we cannot solve the problem exactly (if P ≠ NP)
    can we solve it up to some small error?
  Theorem
    The nearest neighbor distance problem of a partial and a total order
      is 2-approximable for the Kendall tau distance ...
        by a reduction to a constrained feedback arc set problem on tournaments
        and an adaptation/improvement of the 3-approximation of
        Schalekamp/van Zuylen using Quicksort on the feedback arc set problem
      is 4-approximable for the Spearman footrule distance ...
        using the Diaconis-Graham inequality

FPT
  Given: a partial order π and a total order t over {1,...,n}
         a parameter k
  Problem: find an extension s of π in polynomial time such that
    – distance(s, t) ≤ k
    – or show that all extensions of π have distance at least k+1
  Theorem (Brandenburg, Hofmeier, Gleißner, WALCOM 2012, J. Comb. Opt.)
    The distance problems for a partial order π and a total order t
    – for nearest neighbor Kendall tau distance
    – for nearest neighbor Spearman footrule distance
    are fixed parameter tractable with a linear kernel.
  Transform the problem into a small version of size 2k.

Intuition
  ... the distance problem is NP-hard, but
  Suppose there are 1,000 elements and k = 100.
  Then at most 100 "critical" pairs may cross;
  at least 800 elements are not involved.
    ...x ..y... in π
    ...y ..x... in t
  GOAL: find these "800" elements and remove them
        solve the problem only on the "critical" pairs
        - naively, by exhaustive search on all ≤ k! extensions of π.
  TODO: Improve upon the search

Our Key: a Derivation
  Given: a partial order π and a total order t over X = {1,...,n}
  The derivation t(π) of π in direction t is a binary relation over X
  For two elements x,y let x < y in t(π) by
    (i)   agreement   x < y both in π and t
    (ii)  overrule    x < y in π but x > y in t
    (iii) takeover    x ? y in π and x < y in t
  Example
    1 < 4,6,7        by agreement
    1 < 2,3,5,8      by takeover
    8 < 2,3,5        by overrule
    1,4,6,7 < 8      by takeover
    cycle: 8 < 2 (overrule), 2 < 4 (takeover), 4 < 8 (takeover)
    ... and only 8 is involved in all eight cycles; 1 is in none of them.
  [Figure: DAG on the vertices 1,...,8]
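The derivation of the running example and its cycles can be computed directly from the three rules. This is a minimal sketch with assumptions: the edge list `EDGES` is inferred from the example relations (it is not printed on the slides), t is the identity, and the names `kind` and `triangles` are chosen for illustration.

```python
from itertools import combinations

# Assumed transitive reduction of π; an edge (x, y) means x < y in π.
EDGES = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 4), (4, 6), (1, 7)]
X = list(range(1, 9))
pos_t = {x: i for i, x in enumerate(X)}  # t = id

# Transitive closure of π (Warshall): less[x] = {y | x < y in π}.
less = {x: set() for x in X}
for x, y in EDGES:
    less[x].add(y)
for k in X:
    for x in X:
        if k in less[x]:
            less[x] |= less[k]


def kind(x, y):
    """Reason why x < y holds in t(π), or None if y < x holds instead."""
    if y in less[x]:  # π decides: agreement or overrule
        return "agreement" if pos_t[x] < pos_t[y] else "overrule"
    if x in less[y]:
        return None
    return "takeover" if pos_t[x] < pos_t[y] else None  # t decides


def lt(x, y):
    return kind(x, y) is not None


# Every 3-element set that forms a directed cycle in t(π).
triangles = [
    (a, b, c)
    for a, b, c in combinations(X, 3)
    if (lt(a, b) and lt(b, c) and lt(c, a)) or (lt(a, c) and lt(c, b) and lt(b, a))
]
print(len(triangles))  # 8
print(kind(8, 2), kind(2, 4), kind(4, 8))  # overrule takeover takeover
```

The output agrees with the example: there are eight triangles, each of them contains 8, and none contains 1.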
Derivation t(π)
  Lemma (Brandenburg, Gleißner, Hofmeier, WALCOM 2012, J. Comb. Opt.)
    t(π) is complete, defined for all x,y.
    t(π) may have cycles.
    A cycle is made from one overrule and two takeovers.
  Proof:
    Completeness is by definition.
    Cycles: by a case analysis for (x,y,z).
    It follows from the transitivity of π and t
    that an agreement cannot be part of a cycle
    and that two overrules x < y, y < z imply x < z
    by the transitivity of a partial order.

Cycle Rule
  Given: a partial order π and a total order t over X = {1,...,n}
         a parameter k
  Cycle rule for the reduction:
    For every element x
      remove x and keep k
      if x is not in a cycle, in fact not in a triangle x < y < z < x of t(π),
      and there is no overrule on x.
  Example
    There is no cycle and no overrule on 1.
    All cycles use 8 -- 2, 8 -- 3, 8 -- 5
  [Figure: DAG on the vertices 1,...,8]

Proof
  Lemma
    The cycle rule preserves the Kendall-tau distance.
  Proof: (sketch)
    Consider a nearest neighbor s in Ext(π) with K(s, t) ≤ k.
    For every x which is removed define
      pred(x) = {y | y < x in t(π)}
      succ(x) = {z | x < z in t(π)}.
    Claim 1: There is an extension π* of π such that
      y < x for every y ∈ pred(x) and x < z for every z ∈ succ(x):
        pred(x) < x < succ(x) in π*
        pred(x) < x < succ(x) in t
      Then x serves as a "separator".
    Otherwise, consider the first y ∈ pred(x) with y < x in t(π) and x < y in π*.
    Then ... by some case analysis ... y and its left neighbor can be swapped,
    which contradicts "y is the first".
    This needs some more work (see our papers).

Proof
  In the running example, 1 can be removed.
  Place 1 at the first position in the extension of π.
  [Figure: the DAG and the extension 1 8 2 3 4 5 6 7 compared with 1 2 3 4 5 6 7 8]
  then K( ) = 6

Kernel
  Lemma
    The nearest neighbor Kendall tau distance between π and t is ≤ k
    if after the cycle rule, i.e.
    after the removal of all "separators" x,
    there is an instance with at most 2k elements
    which has K(π2k, t2k) ≤ k.
  Find this solution by exhaustive search on the (2k)! extensions of π2k.
  Conclusion
    The distance problem is FPT.
  TODO
    Improve the search for K(π2k, t2k) ≤ k.

some open problems for parameterized complexity

Bucket Orders
  Theorem (Fagin et al 2006)
    There are O(n log n) algorithms to compute the distances
    (Kendall-tau and Spearman) between bucket orders
    with x < y and ties x ~ y. The buckets are totally ordered.
  Theorem
    The rank aggregation problem is
    – NP-hard for total orders under Kendall-tau (Dwork et al 2001)
    – in P for many total orders under Spearman (Dwork et al 2001).
    – NP-hard for many bucket orders under Spearman
      (Brandenburg, Gleißner, Hofmeier, FAW-AAIM 2011)
  OPEN: Is it FPT?

1-planarity
  Definition (G. Ringel, 1965)
    A graph G is 1-planar if each edge is crossed at most once
    (by all other edges)
  Properties
    an edge coloring: black, with crossings red x blue
    a 6-vertex coloring (Borodin 1984)
    #edges ≤ 4n-8 (Pach, Toth 1997, and others)
    not closed under edge contraction
    there are infinitely many minimal non-1-planar graphs (Korzhik, 2007)
    test is NP-hard (Korzhik, Mohar, Graph Drawing 2008, LNCS 5166)

1-planar + Rotation System
  Definition
    a rotation system (embedding) of a graph G = (V,E)
    is the cyclic order of the edges (neighbors) of v for each vertex v
    The crossing pair system of a graph G = (V,E)
    is G together with all pairs (e,e') of crossing edges.
  Lemma
    Given a crossing pair system:
    Test for 1-planarity is in O(n),
    and there is a straight-line drawing of G on a polynomial size grid.
  Claim (under work) (Auer, Brandenburg, Gleißner, Reislhuber)
    Given a rotation system: Test for 1-planarity is NP-hard
    ... by a reduction from planar 3-SAT
Parameterized Complexity
  Given: a graph G = (V,E), a parameter k
  Problem: Is G 1-planar with at most k pairs of crossing edges?

  Given: G with a rotation system and k
  Problem: Is G 1-planar with at most k pairs of crossing edges?

  Given: a directed graph G = (V,E), a parameter k
  Problem: Is G upward 1-planar with at most k pairs of crossing edges?
    i.e. G has a 1-planar drawing such that all edges are upward (y-monotone)

I need your help!
Thank you