Brandenburg-ranking

© 2012 Prof. Dr. Franz J. Brandenburg
Ranking Problems with
Incomplete Information
Fixed Parameter Tractability
of Distance Problems
Franz J. Brandenburg
University of Passau, Germany
Survey
• the problem
• the motivation
• the solution
• new open problems
Similarity of Permutations
Definition:
Given two total orders or permutations s and t on {1,2,....,n}.
How can we measure their dissimilarity?
1) count mismatches --> Kendall tau
2) count moves --> Spearman footrule
3) others (Hamming, count exchanges of x's and y's)
Examples (figure): rotate the middle; swap the extremes
Kendall tau Distance
Definition:
Given two total orders or permutations s and t on {1,2,....,n}.
The Kendall tau (or Kemeny) distance K(s, t) is
K(s, t) = #{(x, y) | x < y and (s(x) – s(y)) • (t(x) – t(y)) < 0}
= # disagreements: x <s y and y <t x
= # "dirty" pairs
= # adjacent swaps (inversions) needed to transform s into t
= bubble sort distance
Examples (figure): rotate the middle, K( ) = 2; swap the extremes, K( ) = 6
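The definition can be checked directly by counting the pairs that s and t order differently. This is a minimal sketch, assuming a total order is given as a list of elements from first to last; the function name and the 5-element sample orders are illustrative and not taken from the slides.

```python
from itertools import combinations

def kendall_tau(s, t):
    """Count the pairs that s and t order differently (O(n^2) time)."""
    pos_s = {x: i for i, x in enumerate(s)}
    pos_t = {x: i for i, x in enumerate(t)}
    return sum(
        1
        for x, y in combinations(s, 2)
        # a negative product means x and y appear in opposite order in s and t
        if (pos_s[x] - pos_s[y]) * (pos_t[x] - pos_t[y]) < 0
    )

identity = [1, 2, 3, 4, 5]
print(kendall_tau(identity, [1, 3, 4, 2, 5]))  # 2: the pairs (2,3) and (2,4) disagree
print(kendall_tau(identity, [2, 1, 3, 4, 5]))  # 1: a single adjacent swap
```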
Kendall-tau
Kendall-tau distance
named after Maurice Kendall (1938, statistics)
invented by Gustav Fechner in 1897
Kendall-tau distance = two-layer crossing problem
There is an O(n log n) algorithm to compute
- the number of crossings of n lines
- the Kendall-tau distance of two total orders
Open problem
Is there an O(n) algorithm?
Compute the inversion numbers (D.E. Knuth 1968)
inv(i) = #{ j | j > i and j left of i}
Updates in O(log n) in a search tree.
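One common way to reach O(n log n) is to count inversions during a merge sort. The sketch below uses that technique instead of the search tree mentioned on the slide; the function names are illustrative, and orders are again lists of elements from first to last.

```python
def count_inversions(seq):
    """Return (sorted copy of seq, number of pairs i < j with seq[i] > seq[j])."""
    if len(seq) <= 1:
        return list(seq), 0
    mid = len(seq) // 2
    left, inv_left = count_inversions(seq[:mid])
    right, inv_right = count_inversions(seq[mid:])
    merged, inv = [], inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] jumps over every element still waiting in left
            merged.append(right[j])
            j += 1
            inv += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inv

def kendall_tau_fast(s, t):
    """Kendall tau distance: relabel s by positions in t, then count inversions."""
    pos_t = {x: i for i, x in enumerate(t)}
    return count_inversions([pos_t[x] for x in s])[1]

print(kendall_tau_fast([8, 1, 2, 3, 7, 4, 5, 6], [1, 2, 3, 4, 5, 6, 7, 8]))  # 10
```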
Spearman-Footrule distance
Spearman footrule distance (not to be confused with Spearman's rho, which sums squared differences)
named after C. Spearman (1904) (correlation ranking in statistics)
compute displacements of two permutations
move an element by k units
the L1 - vector norm
F(s, t) = ∑i |s(i) – t(i)|
Examples (figure): value 8; value 2
Lemma
For total orders the Spearman footrule distance can be computed in O(n)
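The linear-time computation is a single pass over the positions. This is a minimal sketch, assuming orders are lists of elements from first to last; the function name is illustrative.

```python
def spearman_footrule(s, t):
    """Sum over all elements of the absolute difference of their positions in s and t."""
    pos_t = {x: i for i, x in enumerate(t)}
    return sum(abs(i - pos_t[x]) for i, x in enumerate(s))

identity = [1, 2, 3, 4, 5]
print(spearman_footrule(identity, [5, 2, 3, 4, 1]))  # 8: two elements move by 4 each
print(spearman_footrule(identity, [2, 1, 3, 4, 5]))  # 2: two elements move by 1 each
```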
Diaconis-Graham
Theorem
The Diaconis-Graham inequality (1977)
K(s, t) ≤ F(s, t) ≤ 2•K(s, t)
each crossing/mismatch/swap induces a displacement
each displacement is repaired by two crossings.
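The inequality can be sanity-checked by brute force on small n. This sketch compares every permutation of 0..n-1 with the identity, which loses no generality because both distances are invariant under relabeling; the helper names are illustrative.

```python
from itertools import combinations, permutations

def kendall_to_identity(p):
    """K(p, id): the number of inversions of p."""
    return sum(1 for a, b in combinations(p, 2) if a > b)

def footrule_to_identity(p):
    """F(p, id): element v sits at position i, so its displacement is |v - i|."""
    return sum(abs(v - i) for i, v in enumerate(p))

def diaconis_graham_holds(n):
    """Check K <= F <= 2K for all n! permutations of 0..n-1."""
    return all(
        kendall_to_identity(p) <= footrule_to_identity(p) <= 2 * kendall_to_identity(p)
        for p in permutations(range(n))
    )

print(all(diaconis_graham_holds(n) for n in range(1, 7)))  # True
```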
Incomplete Information
total order
x < y or y < x for every pair of candidates x and y
ties
x ~ y
x and y are equivalent (an equivalence relation)
"I don't care for x and y"
bucket orders with equivalent items in a bucket
and a total order for the buckets
partial order
x ? y
x and y are unrelated
"apples and oranges"
? is not transitive
interval orders, hierarchical orders (trees)
contradictory relations with cycles (from Lullus, 1299)
Generalization
Given:
a set of candidates X = {x1,...,xn} or simply {1,...,n}
a partial order π on X, i.e. X is partially ordered by >
say x > y if x has a higher ranking,
i.e. if there is a preference for x
properties:
transitive: x > y and y > z ⇒ x > z
partial: many pairs are unrelated
Example: The Australian Open 2012 (figure: Djokovic, Nadal, Murray, Federer)
the partial order implies that Djokovic beats Federer
Murray and Nadal / Federer are incomparable
Partial Orders
Given: A set of candidates X and a partial order π
Representation of π:
a DAG (directed acyclic graph) with transitive edges
vertices X = {1,...,n}
directed edges x ---> y if x > y in π
the DAG displays only the generating edges, the transitive reduction
(figure: DAG of π on the vertices 1,...,8)
e.g. 1 and 8 are unrelated; 8 > 2, 8 > 3, 8 > 5; 2 and 3 are unrelated
Extensions
How shall we compare two partial orders?
... via their sets of extensions
Ext(π) = {total orders t | t does not disagree with π}
π(i) < π(j) ⇒ t(i) < t(j)
Ext(π) = {any order obtained from the DAG of π by topological sorting}
Example:
Ext(π = Ø) = all permutations
Ext(π) = {all shuffles of
left (8,2,3,5), (8,3,2,5) and
right (1,7,4,6), (1,4,7,6), (1,4,6,7)}
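For a small example the set Ext(π) can be enumerated by backtracking over all choices of the next source. The precedence pairs in BEFORE are an assumption: they are reconstructed from the left and right orders listed above, since the figure itself is not available. The function name is illustrative.

```python
def linear_extensions(elements, before):
    """Yield every total order in which x precedes y for each pair (x, y) in before."""
    elements = list(elements)
    preds = {v: {x for x, y in before if y == v} for v in elements}

    def extend(prefix, placed):
        if len(prefix) == len(elements):
            yield tuple(prefix)
            return
        for v in elements:
            # v may come next if it is unplaced and all its predecessors are placed
            if v not in placed and preds[v] <= placed:
                yield from extend(prefix + [v], placed | {v})

    yield from extend([], frozenset())

# Assumed generating edges of the example: "x before y" pairs.
BEFORE = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 4), (1, 7), (4, 6)]
extensions = list(linear_extensions(range(1, 9), BEFORE))
print(len(extensions))  # 420 = 2 left orders * 3 right orders * C(8,4) shuffles
```

The count grows quickly with the number of unrelated pairs, so this enumeration is only practical for tiny instances.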
topSort
An extension of π is a topological sorting
topsort: do {
    get any source x (no incoming edges);
    print x;
    delete x;
} while (there are vertices)
Ext(π) = the set of all topsort runs; all possibilities for "any"
Theorem (Brightwell, Winkler, ACM STOC 1991)
Computing |Ext(π)| is #P-complete.
Strategies for "any" (figure: DAG of π on the vertices 1,...,8):
– breadth first: 8, 1, 2, 3, 7, 4, 5, 6
– heap (min-heap): 1, 4, 6, 7, 8, 2, 3, 5
– best: 1, 8, 2, 3, 4, 5, 6, 7
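The breadth-first and min-heap strategies differ only in the container that holds the current sources. The following sketch makes that explicit. The edge list and the tie order (vertex 8 listed before 1, edge 1→7 before 1→4) are assumptions chosen so that the FIFO run matches the breadth-first order shown on the slide; the function name is illustrative.

```python
import heapq
from collections import deque

def topsort(vertices, edges, strategy="fifo"):
    """Topological sort; strategy picks the next source: 'fifo' queue or min-'heap'."""
    succ = {v: [] for v in vertices}
    indegree = {v: 0 for v in vertices}
    for x, y in edges:
        succ[x].append(y)
        indegree[y] += 1
    sources = [v for v in vertices if indegree[v] == 0]
    if strategy == "fifo":
        pool = deque(sources)
    else:
        pool = sources
        heapq.heapify(pool)
    order = []
    while pool:
        x = pool.popleft() if strategy == "fifo" else heapq.heappop(pool)
        order.append(x)
        for y in succ[x]:  # "delete x": its successors lose one incoming edge
            indegree[y] -= 1
            if indegree[y] == 0:
                if strategy == "fifo":
                    pool.append(y)
                else:
                    heapq.heappush(pool, y)
    if len(order) != len(vertices):
        raise ValueError("the graph has a cycle")
    return order

VERTICES = [8, 1, 2, 3, 4, 5, 6, 7]  # assumed tie order: 8 before 1
EDGES = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 7), (1, 4), (4, 6)]  # assumed edges
print(topsort(VERTICES, EDGES, "fifo"))  # [8, 1, 2, 3, 7, 4, 5, 6]
print(topsort(VERTICES, EDGES, "heap"))  # [1, 4, 6, 7, 8, 2, 3, 5]
```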
Distance Measures
Given: a partial order π and a total (partial) order t
What is their distance?
nearest neighbor distance
K_NN(π, t) = min {K(s, t) | s is an extension of π}
Interpretation: the positive view,
there is some extension of π at distance ≤ k
Hausdorff (farthest neighbor) distance
K_FN(π, t) = max {K(s, t) | s is an extension of π}
Interpretation: the negative view,
all extensions are within distance ≤ k
Example with t = id:
breadth first: 8, 1, 2, 3, 7, 4, 5, 6 with K( , id) = 10
heap (min-heap): 1, 4, 6, 7, 8, 2, 3, 5 with K( , id) = 11
best: 1, 8, 2, 3, 4, 5, 6, 7 with K( , id) = 6
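On an instance this small, both distances can be found by exhaustive search over all extensions. The sketch filters all 8! permutations, which is only meant as an illustration and not as the method of the talk. The precedence pairs in BEFORE are an assumption reconstructed from the extensions listed on the earlier slides.

```python
from itertools import combinations, permutations

# Assumed generating edges of the example partial order: "x before y" pairs.
BEFORE = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 4), (1, 7), (4, 6)]

def is_extension(order):
    """True if the total order respects every precedence pair."""
    pos = {x: i for i, x in enumerate(order)}
    return all(pos[x] < pos[y] for x, y in BEFORE)

def kendall_to_identity(order):
    """K(order, id): pairs that appear in decreasing order."""
    return sum(1 for a, b in combinations(order, 2) if a > b)

distances = [
    kendall_to_identity(p) for p in permutations(range(1, 9)) if is_extension(p)
]
k_nn = min(distances)  # nearest neighbor distance to id
k_fn = max(distances)  # farthest neighbor distance to id
print(k_nn)  # 6
print(k_fn)
```

The value 6 agrees with a direct argument: 8 must precede 2, and each of 3, 4, 5, 6, 7 is then inverted with 8 or with 2, which gives at least 1 + 5 inversions.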
Distance Measures
nearest neighbors = closest red-blue pair or min min
farthest neighbors = farthest red-blue pair or max max
Hausdorff distance = max {min distance{red,blue}}
center distance = min {max distance{red,blue}}
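For two finite point sets the four measures can be written as nested min/max expressions. This brute-force sketch runs in O(n·m) time and is only meant to pin down the definitions. Two readings are assumptions: the Hausdorff distance is taken in its usual symmetric form, and the center distance is taken as the minimum over red points of the maximum distance to a blue point.

```python
from math import dist, isclose, sqrt

def nearest(red, blue):
    """min min: the closest red-blue pair."""
    return min(dist(r, b) for r in red for b in blue)

def farthest(red, blue):
    """max max: the farthest red-blue pair."""
    return max(dist(r, b) for r in red for b in blue)

def hausdorff(red, blue):
    """max min, taken in both directions (the usual symmetric form)."""
    def directed(a_set, b_set):
        return max(min(dist(a, b) for b in b_set) for a in a_set)
    return max(directed(red, blue), directed(blue, red))

def center(red, blue):
    """min max: the red point whose farthest blue point is closest (assumed reading)."""
    return min(max(dist(r, b) for b in blue) for r in red)

red = [(0, 0), (1, 0)]
blue = [(0, 3), (4, 0)]
print(nearest(red, blue), farthest(red, blue), hausdorff(red, blue))  # 3.0 4.0 3.0
print(isclose(center(red, blue), sqrt(10)))  # True
```

Two different sets that share a point have nearest neighbor distance 0, which is one way to see that this measure cannot be a metric.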
Measures
Hausdorff distance (Felix Hausdorff 1868-1942) is a metric.
nearest neighbor, farthest neighbor, center distance are not!
since d(X,Y) = 0 does not imply X = Y
and there is no triangle inequality
In R^2, for points p = (x,y),
all four distances are in Θ(n log n).
for Partial Orders
(figure: DAG of π on the vertices 1,...,8; each extension is compared with t = id = 1 2 3 4 5 6 7 8)
8 1 2 3 7 4 5 6 (breadth first): Kendall-tau = 10, Spearman = 18
1 4 6 7 8 2 3 5 (min-heap): Kendall-tau = 11, Spearman = 22
1 8 2 3 4 5 6 7 (opt): Kendall-tau = 6, Spearman = 12
Applications
• ranking problems
– in sport: who is the champion, ranking
– in metasearch: aggregate data from several search engines, top-k lists
– voting systems
Sport
• Who was the best Formula 1 driver 2011?
  • Sebastian Vettel, Ger: 392 pts
  • Jenson Button, Eng: 270 pts
  • Mark Webber, Aus: 258 pts
  Schema: weighted Borda scores, sum points (25,18,...,1)
• Who is the best tennis player? The best possible ranking?
  – the winner of the Australian Open?
    but Djokovic did not play Federer
  – the aggregate winner of the four Grand Slams?
    evaluate the data from four trees: incomplete data
  The ranking list is by weighted Borda scores
• An alternative: US sports
  • phase 1: scores
  • phase 2: finals
Meta Search
• meta search engines (incl. Google)
aggregate the rankings (results) of many individual search engines
searcher_1 = (2,4,3,5,1,...)
searcher_2 = (4,1,3,5,2,...)
searcher_3 = (5,3,2,1,4,...)
e.g. for hotels, flights, rental cars, ...
but in practice many providers do not give you the best offer.
They make the $$$
How do they aggregate?
How do they compute the ranking?
the top - k list?
the first page?
Rank Aggregation
The rank aggregation problem:
Given:
a collection of total orders / permutations over n elements
(t1, t2,...,tm)
Problem: the best compromise (Kemeny score)
a permutation or a linear order t* such that
distance(t*, t1, t2,...,tm) = ∑i distance(t*, ti) → MIN
or: treat every voter as fairly as possible (Biedl, Brandenburg, Deng, Disc. Math. 2009)
maxi {distance(t*, ti)} → MIN
As a decision problem:
Given k:
Is there a t* with distance(t*, t1, t2,...,tm) ≤ k ?
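For a handful of candidates the sum version can be solved by trying every permutation. The sketch below does exactly that and is exponential in n, so it only illustrates the objective. The three votes are hypothetical and the function names are illustrative.

```python
from itertools import combinations, permutations

def kendall_tau(s, t):
    """Number of pairs that s and t order differently."""
    pos_s = {x: i for i, x in enumerate(s)}
    pos_t = {x: i for i, x in enumerate(t)}
    return sum(
        1
        for x, y in combinations(s, 2)
        if (pos_s[x] - pos_s[y]) * (pos_t[x] - pos_t[y]) < 0
    )

def kemeny_consensus(votes):
    """Return (t*, score) minimizing the sum of Kendall tau distances to all votes."""
    def score(candidate):
        return sum(kendall_tau(candidate, v) for v in votes)
    best = min(permutations(votes[0]), key=score)
    return best, score(best)

votes = [(1, 2, 3), (1, 3, 2), (2, 1, 3)]  # hypothetical voters
consensus, total = kemeny_consensus(votes)
print(consensus, total)  # (1, 2, 3) 2
k = 2
print(total <= k)  # the decision version for k = 2: True
```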
Facts
Theorem:
The rank aggregation problem under the Kendall-tau distance
(1) is NP hard
- for many voters
(Bartholdi, Tovey, Trick, 1989)
- for even numbers ≥ 4 (Dwork et al, WWW 2001)
by a complex reduction from feedback arc set
(small corrections by Biedl, Brandenburg, Deng Disc. Math. 2009)
- in the max version (Biedl et al 2009)
(2) is solvable in O(n) for two voters (you and your boy/girlfriend/husband...)
... take any of the two or any in between (crossing)
Facts (2)
(3) Kemeny score (rank aggregation with Kendall tau) is
fixed parameter tractable for several parameters:
Kemeny score
number of candidates
max and average range of candidate positions
(Betzler, Fellows, Guo, Niedermeier, Rosamond, TCS 410(45) 2009)
In contrast:
(4) rank aggregation under the Spearman footrule distance is in O(n^3)
for any number of voters (and even with local weights)
(Dwork et al, 2001)
by weighted matching:
for i,j = 1,...,n set w(i,j) = the cost of placing i at position j
... and matching does the rest.
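The cost matrix can be built directly from the votes: w(i,j) is the total displacement over all voters if candidate i is placed at position j. In the sketch below the matching step is replaced by a brute-force search over all assignments, which is only a stand-in for a polynomial min-cost perfect matching algorithm and works for tiny n only. The votes are hypothetical and the function name is illustrative.

```python
from itertools import permutations

def footrule_aggregate(votes):
    """Return (order, cost) minimizing the total Spearman footrule distance to the votes."""
    candidates = sorted(votes[0])
    n = len(candidates)
    # w[c][j]: summed displacement over all voters if candidate c sits at position j
    w = {
        c: [sum(abs(v.index(c) - j) for v in votes) for j in range(n)]
        for c in candidates
    }

    def cost(order):
        return sum(w[c][j] for j, c in enumerate(order))

    # Stand-in for weighted bipartite matching: try every candidate-to-position assignment.
    best = min(permutations(candidates), key=cost)
    return best, cost(best)

votes = [(1, 2, 3), (1, 3, 2), (2, 1, 3)]  # hypothetical voters
print(footrule_aggregate(votes))  # ((1, 2, 3), 4)
```

The point of the matrix is that the objective splits into one independent cost per (candidate, position) pair, which is what makes an assignment algorithm applicable.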
Results on Distances
Given:
a partial order π and a total order id = (1,...,n)
nearest neighbor distance K(π, id)
a total order s in Ext(π) such that K(s,id) --> MIN
Theorem (Brandenburg, Gleissner, Hofmeier, WALCOM 2012 / J. Comb. Opt. to appear)
The nearest neighbor distance problem is NP-hard,
Kendall tau: by reduction from one-sided crossing minimization
in the version OSCM-4-stars
Spearman: by reduction from clique
(figure: two-level drawing with a fixed and a mobile level)
Idea: the lower level is π with 4 elements per point,
no relations between points, and extra blockers and about n^2 crossings
NP-hard: What's next
Approximation
if we cannot solve the problem exactly (if P ≠ NP),
can we solve it up to some small error?
Theorem
The nearest neighbor distance problem
of a partial and a total order
– is 2-approximable for the Kendall tau distance
... by a reduction to a constrained feedback arc set problem
on tournaments and an adaptation/improvement of the
3-approximation of Schalekamp/van Zuylen
using Quicksort on the feedback arc set problem
– is 4-approximable for the Spearman footrule distance
... using the Diaconis-Graham inequality
FPT
Given:
a partial order π and a total order t over {1,....,n}
a parameter k
Problem: find an extension s of π in polynomial time such that
– distance(s, t) ≤ k
– or show that all extensions of π have distance at least k+1
Theorem (Brandenburg, Hofmeier, Gleißner, WALCOM 2012, J. Comb. Opt.)
The distance problems for a partial order π and a total order t
– for nearest neighbor Kendall tau distance
– for nearest neighbor Spearman footrule distance
are fixed parameter tractable
with a linear kernel.
Transform the problem into a small version of size 2k.
Intuition
.... the distance problem is NP-hard
but
Suppose there are 1,000 elements
and k = 100.
Then
at most 100 "critical" pairs may cross, involving at most 200 elements,
so at least 800 elements are not involved.
...x ..y... in π
...y ..x... in t
GOAL:
find these "800" elements and remove them
solve the problem only on the "critical" pairs
- naive by exhaustive search on all ≤ k! extensions of π.
TODO: Improve upon the search
Our Key: a Derivation
Given: a partial order π and a total order t over X = {1,....,n}
The derivation t(π) of π in direction t is a binary relation over X
For two elements x,y
let x < y in t(π) by
(i) agreement: x < y both in π and t
(ii) overrule: x < y in π but x > y in t
(iii) takeover: x ? y (unrelated) in π and x < y in t
Example (t = id; figure: DAG of π on the vertices 1,...,8)
1 < 4,6,7 by agreement
1 < 2,3,5,8 by takeover
8 < 2,3,5 by overrule
1,4,6,7 < 8 by takeover
cycle: 8 < 2 (overrule), 2 < 4 (takeover), 4 < 8 (takeover)
... all eight cycles involve 8, and none of them involves 1.
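The derivation can be computed pair by pair from the three rules. The sketch below does this for the running example with t = id and lists the triangles of t(π). The precedence pairs in BEFORE are an assumption reconstructed from the extensions shown on the earlier slides, and the function names are illustrative.

```python
from itertools import permutations

def transitive_closure(pairs):
    """Add (a, d) whenever (a, b) and (b, d) are present, until nothing changes."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def derivation(elements, pi, t):
    """Return {(x, y): rule}, meaning x < y in t(pi) by that rule."""
    pos = {x: i for i, x in enumerate(t)}
    rel = {}
    for x, y in permutations(elements, 2):
        if (x, y) in pi:
            rel[(x, y)] = "agreement" if pos[x] < pos[y] else "overrule"
        elif (y, x) not in pi and pos[x] < pos[y]:
            rel[(x, y)] = "takeover"
    return rel

def triangles(elements, rel):
    """All 3-cycles x < y < z < x of the relation, each reported once as a set."""
    return {
        frozenset((x, y, z))
        for x, y, z in permutations(elements, 3)
        if (x, y) in rel and (y, z) in rel and (z, x) in rel
    }

ELEMENTS = list(range(1, 9))
BEFORE = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 4), (1, 7), (4, 6)]  # assumed edges
rel = derivation(ELEMENTS, transitive_closure(BEFORE), ELEMENTS)
cycles = triangles(ELEMENTS, rel)
print(rel[(8, 2)], rel[(2, 4)], rel[(4, 8)])  # overrule takeover takeover
print(len(cycles), all(8 in c for c in cycles))  # 8 True
```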
Derivation t(π)
Lemma (Brandenburg, Gleissner, Hofmeier, WALCOM 2012, J. Comb. Opt.)
t(π) is complete, i.e. defined for all x,y.
t(π) may have cycles.
A cycle is made from one overrule and two takeovers.
Proof: Completeness is by definition.
Cycles: by a case analysis for (x,y,z).
It follows from the transitivity of π and t
that an agreement cannot be part of a cycle
and that two overrules x < y, y < z imply x < z
by the transitivity of a partial order.
Cycle Rule
Given: a partial order π and a total order t over X = {1,....,n}
a parameter k
Cycle rule for the reduction:
For every element x
remove x and keep k
if x is not in a cycle, in fact not in a triangle x < y < z < x of t(π)
and there is no overrule on x.
Example
There is no cycle and no overrule on 1.
All cycles use 8 -- 2, 8 -- 3, 8 -- 5
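Applied to the running example with t = id, the cycle rule can be sketched as a filter over the elements. The set PI is an assumption: it is the transitive closure of the precedence pairs reconstructed from the extensions on the earlier slides. The function names are illustrative.

```python
from itertools import permutations

ELEMENTS = range(1, 9)
# Assumed partial order of the example, transitively closed: "x before y" pairs.
PI = {(8, 2), (8, 3), (8, 5), (2, 5), (3, 5), (1, 4), (1, 6), (1, 7), (4, 6)}

def less(x, y):
    """x < y in t(pi) for t = id."""
    if (x, y) in PI:
        return True   # agreement or overrule: pi decides
    if (y, x) in PI:
        return False
    return x < y      # takeover: x and y are unrelated in pi, so t decides

def removable(elements):
    """Elements that lie in no triangle of t(pi) and take part in no overrule."""
    in_triangle = set()
    for x, y, z in permutations(elements, 3):
        if less(x, y) and less(y, z) and less(z, x):
            in_triangle.update((x, y, z))
    # an overrule is a pair of pi that t = id orders the other way round
    overruled = {v for x, y in PI if x > y for v in (x, y)}
    return [x for x in elements if x not in in_triangle and x not in overruled]

print(removable(ELEMENTS))  # [1]
```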
Proof
Lemma
The cycle rule preserves the Kendall-tau distance.
Proof: (sketch)
Consider a nearest neighbor s in Ext(π) with K(s, t) ≤ k.
For every x which is removed define
pred(x) = {y | y < x in t(π)}
succ(x) = {z | x < z in t(π)}.
Claim 1: There is an extension π* of π such that
y < x for every y ∈ pred(x) and
x < z for every z ∈ succ(x),
i.e. pred(x) x succ(x) in π*,
and also pred(x) x succ(x) in t.
Then x serves as a "separator".
Otherwise, consider the first y ∈ pred(x) with y < x in t(π) and x < y in π*.
Then ... by some case analysis ... y and its left neighbor can be swapped,
which contradicts "y is the first".
This needs some more work (see our papers).
Proof
In the running example,
1 can be removed.
Place 1 at the first position in the extension of π.
(figure: DAG of π on the vertices 1,...,8)
extension of π: 1 8 2 3 4 5 6 7
t = id: 1 2 3 4 5 6 7 8
then K( ) = 6
Kernel
Lemma
The nearest neighbor Kendall tau distance between π and t is ≤ k
if, after the cycle rule,
i.e. after the removal of all "separators" x,
there is an instance with at most 2k elements which has K(π_2k, t_2k) ≤ k.
Find this solution by exhaustive search on the at most (2k)! extensions of π_2k.
Conclusion
The distance problem is FPT.
TODO
Improve the search for K(π_2k, t_2k) ≤ k.
some open problems
for parameterized complexity
Bucket Orders
Theorem (Fagin et al 2006)
There are O(n log n) algorithms to compute
the distances (Kendall-tau and Spearman) between bucket orders
with x < y and ties x ~ y.
The buckets are totally ordered.
Theorem
The rank aggregation problem is
– NP-hard for total orders under Kendall-tau (Dwork et al 2001)
– in P for many total orders under Spearman (Dwork et al 2001)
– NP-hard for many bucket orders under Spearman
(Brandenburg, Gleissner, Hofmeier FAW-AAIM 2011)
OPEN: Is it FPT?
1-planarity
Definition (G. Ringel, 1965)
A graph G is 1-planar
if each edge is crossed at most once (by all other edges)
Properties
an edge coloring
black with crossings
red x blue
a 6-vertex coloring (Borodin 1984)
#edges ≤ 4n-8 (Pach, Toth 1997, and others)
not closed under edge contraction
there are infinitely many minimal non-1-planar graphs (Korzhik, 2007)
test is NP-hard (Korzhik, Mohar Graph Drawing 2008, LNCS 5166)
1-planar + Rotation System
Definition
a rotation system (embedding) of a graph G = (V,E)
is the cyclic order of the edges (neighbors) of v for each vertex v
The crossing pair system of a graph G = (V,E)
is G together with all pairs (e,e‘) of crossing edges.
Lemma
Given a crossing pair system.
Test for 1-planarity is in O(n),
and there is a straight-line drawing of G on a polynomial size grid.
Claim (under work) (Auer, Brandenburg, Gleißner, Reislhuber)
Given a rotation system:
Test for 1-planarity is NP-hard
.... by a reduction from planar 3-SAT
Parameterized Complexity
Given:
a graph G = (V,E)
a parameter k
Problem: Is G 1-planar with at most k pairs of crossing edges?
Given: G with a rotation system and k
Problem: Is G 1-planar with at most k pairs of crossing edges?
Given:
a directed graph G = (V,E)
a parameter k
Problem: Is G upward 1-planar with at most k pairs of crossing edges?
i.e. G has a 1-planar drawing such that all
edges are upward (y-monotone)
I need your help!
Thank you