Chapter-4-Slides-Proximity-Graph-Mohammad

An Introduction to Proximity Graphs
L. Mathieson², P. Moscato¹
¹ School of Electrical Engineering and Computing, The University of Newcastle, Australia.
² School of Software, University of Technology Sydney, Australia.
5th Dec. 2017
Outline
An Introduction to Proximity Graphs
1. Introduction
2. Kinds of Proximity Graphs
Minimum Spanning Trees
Relative Neighbourhood Graphs
Gabriel Graphs
β-Skeletons
Delaunay Triangulations
Urquhart Graphs
Sphere of Influence Graphs
Shortest Path Trees
Minimum Steiner Trees
Nearest Neighbour Graphs
Unit Disk Graphs
3. An Example Application to Recommender Systems
4. Conclusion
Introduction
Proximity graphs
Proximity graphs are one of the combinatorial data-miner’s
frontline tools. They allow expression of complex proximity
relationships and are the basis of many other algorithms.
Here we introduce the concept of proximity graphs, present basic
definitions and discuss some of the most common types of
proximity graphs.
Introduction
Distance and Space
- One of the first natural, informal questions we ask when we wish to extract interesting relationships from a dataset is: “how far are each of these data points from each other?”
- We must first know how to measure distance before we can talk about “near” and “far”.
- Once we develop some suitable metric, we can begin to explore the relationships between points in the dataset in terms of their proximity.
- One of the most natural, flexible and powerful mathematical tools for expressing and exploring relationships is the graph.
- Thus when we wish to express the totality of proximity relationships in a dataset, proximity graphs are one of the most expressive and effective tools available.
Basic Definitions and Notations
Distance
Basic Definitions
- A graph G = (V, E) consists of a set V of vertices and a set E ⊆ V × V of edges.
- A simple graph is a graph with no edges of the form (v, v), i.e., no self loops, and no multi-edges.
- An undirected graph is a graph where the edge (u, v) is identical to the edge (v, u), i.e., the edges do not have a direction.
- An edge-weighted graph is a graph where each edge has an associated weight (implicitly numeric).
- An unweighted graph can be thought of as having all edge weights equal (say, all equal to 1).
Basic Definitions and Notations
Graphs
- Assume that we have a set of points (or vertices) V = {v_1, ..., v_n} in a space over which is defined a distance measure d : V × V → R.
  - The range R of the function d can be chosen arbitrarily: any set equipped with a notion of ordering is sufficient.
  - However, we can typically find an isomorphism between the range and some subset of the real numbers.
- d(u, v) is defined for every pair u, v ∈ V.
  - However, it is possible to relax d to allow a null value of some form (∞, or even a sufficiently large number).
- For a graph with edge lengths, the direct distance between two adjacent vertices may not be the shortest distance between those two vertices.
  - The distance d(u, v) between u and v and the length l(u, v) of the edge (u, v) are not necessarily the same.
Basic Definitions and Notations
Graphs
- For convenience we define the complete, weighted graph on point set V to be the graph where
  - V forms the set of vertices,
  - for every pair u, v ∈ V of distinct vertices, there is an edge (u, v) ∈ E,
  - the weight of (u, v) is d(u, v).
- We will denote this by K[V].
- Every proximity graph on V is a subgraph of K[V].
Figure 1: The base set of points in a 2D
plane using the Euclidean distance metric.
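As a minimal sketch of the definition of K[V], assuming points are coordinate tuples and d is the Euclidean distance (the function name `complete_graph` and the sample points are illustrative, not from the slides):

```python
import math
from itertools import combinations

def complete_graph(points, d=math.dist):
    """Return K[V] as a dict mapping each unordered pair (i, j), i < j,
    of point indices to the weight d(points[i], points[j])."""
    return {(i, j): d(points[i], points[j])
            for i, j in combinations(range(len(points)), 2)}

# Three illustrative points forming a 3-4-5 right triangle.
points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
K = complete_graph(points)
```

Every proximity graph discussed later can then be described as a subset of the keys of `K`.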
Minimum Spanning Trees
Definition
- A path between vertices u_1 and u_k in a graph is a set of edges {(u_1, u_2), (u_2, u_3), ..., (u_{k−1}, u_k)} where u_i ≠ u_j for all distinct i, j ∈ {1, ..., k}.
- A graph is connected if there is a path between every pair of vertices in the graph.
- A tree is a connected graph with no cycles.
  - Given a graph, a spanning tree is a tree that connects every vertex in the graph. [Annotation: it links all the points together.]
  - A minimum spanning tree (MST) of a graph is a spanning tree where the sum of weights on the edges is the minimum over all possible spanning trees.
- MSTs are a fundamental notion in graph theory and are widely applicable in many practical areas.
Minimum Spanning Trees
Prim’s algorithm
Algorithms for Computing MST
- To see how an MST is a prototypical proximity graph, we introduce Prim’s algorithm for computing MSTs.
- The selection of which edges to add to the resultant MST is based on the next nearest vertex to anything currently in the tree (line 4).

Algorithm 1: Prim’s Algorithm [Annotation: look it up on Wikipedia.]
Input : A simple, connected, undirected, edge-weighted graph G = (V, E).
Output: A MST of G.
1 Arbitrarily select a vertex v from V.
2 Initialise a tree T = (W, F) with W := {v} and F := ∅.
3 while W ≠ V do
4     Pick the smallest weight edge (u, v) from E with u ∈ W and v ∉ W (break ties arbitrarily)
5     Add v to W and (u, v) to F
6 end
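Algorithm 1 can be sketched in Python as follows. This is an illustrative dense-graph variant that works directly on a point set (i.e. on K[V]) and keeps, for each vertex outside the tree, the cheapest edge linking it to the tree. The name `prim_mst` and the choice of vertex 0 as the arbitrary start are assumptions of the sketch.

```python
import math

def prim_mst(points, d=math.dist):
    """Return the edges (parent, child) of an MST of K[V] for the given points."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n    # weight of cheapest edge from the tree W to each vertex
    parent = [None] * n      # tree endpoint of that cheapest edge
    best[0] = 0.0            # arbitrary start vertex (assumption: index 0)
    edges = []
    for _ in range(n):
        # Line 4: pick the vertex outside W that is nearest to anything in W.
        v = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[v] = True
        # Line 5: add v to W and the selected edge to F.
        if parent[v] is not None:
            edges.append((parent[v], v))
        # Update the cheapest connecting edge for the remaining vertices.
        for w in range(n):
            if not in_tree[w]:
                dw = d(points[v], points[w])
                if dw < best[w]:
                    best[w] = dw
                    parent[w] = v
    return edges
```

This variant performs O(n²) distance evaluations, which matches the dense setting where the input is the complete graph on the points.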
Minimum Spanning Trees
Definition
- In general, MSTs are not unique.
  - For example, if the weights of all edges in the input graph are identical, any spanning tree is a MST.
- However, if the weight of each edge is unique, the graph has a single MST.
- In the case of the example point set, the MST, shown in Figure 2, is unique as a result of the random distribution of points.
Figure 2: A MST for the example point set.
Minimum Spanning Trees
Algorithms
Types of MST Computation algorithm
There are a large variety of algorithms for computing minimum
spanning trees:
- Borůvka’s algorithm [3], reinvented three further times under different names [7, 14], most notably as Sollin’s algorithm [35].
- Prim’s algorithm, also invented repeatedly [22, 31, 11], first by Vojtěch Jarník [22]. The name Prim’s algorithm is however almost universally used.
- Kruskal’s algorithm [26]; the reverse-delete algorithm [26]; Karger, Klein and Tarjan’s randomized algorithm [24] and Chazelle’s soft heap algorithm [6].
Minimum Spanning Trees
Algorithms
- Those are all polynomial time algorithms and, apart from Karger, Klein and Tarjan’s algorithm, deterministic.
  - The randomness in the randomized algorithm does not affect solution quality, but reduces the runtime to linear time.
  - The worst case is identical to Borůvka’s algorithm.
- These algorithms also deal with the most general, non-geometric case
  - the input is simply a set of vertices with arbitrary distances defined between them.
- We expect the input graph to be K[V], a dense graph (indeed, the densest!) with $\frac{n(n-1)}{2}$ edges where n = |V|
  - worst case performance (and for most a best case as well) of at least O(n² log n). [Annotation: very important.]
- Using the Fibonacci heap data structure, Fredman and Tarjan developed a deterministic linear time algorithm [17], employing Prim’s algorithm as a subroutine
  - worst case runtime of O(n²) in our context
  - the running time was possible at the cost of a complicated implementation.
Relative Neighbourhood Graphs
Definition
Relative Neighbourhood Graphs (RNG)
- The Relative Neighbourhood Graph (RNG) connects two points (on the Euclidean plane [39]) if there does not exist a third point that is closer to both than they are to each other.
  - The edge (u, v) is in the RNG if there is no third point r such that d(r, u) < d(u, v) and d(r, v) < d(u, v).
- Figure 3 gives the RNG for the example point set.
- On the Euclidean plane, the RNG can be computed in O(n log n) time [37].
- Given a Delaunay triangulation, this can be reduced to O(n) time [29].
Figure 3: The Relative Neighbourhood
Graph for the example point set.
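The edge condition above translates directly into a brute-force test over all triples. The following is a minimal sketch (O(n³), not the O(n log n) method of [37]); the function name and sample points are illustrative assumptions.

```python
import math
from itertools import combinations

def relative_neighbourhood_graph(points, d=math.dist):
    """Return RNG edges (u, v), u < v, by testing every third point r."""
    n = len(points)
    edges = []
    for u, v in combinations(range(n), 2):
        duv = d(points[u], points[v])
        # (u, v) is excluded if some r is strictly closer to both endpoints
        # than the endpoints are to each other.
        blocked = any(
            d(points[r], points[u]) < duv and d(points[r], points[v]) < duv
            for r in range(n) if r != u and r != v
        )
        if not blocked:
            edges.append((u, v))
    return edges

# The third point sits between the first two, so the long edge (0, 1) is dropped.
print(relative_neighbourhood_graph([(0, 0), (2, 0), (1, 0.5)]))
```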
Relative Neighbourhood Graphs
Example
- When working with suitable spaces we may use an alternate definition which turns out to be a specific case of a more general proximity graph model.
- Arising from work on geometric graphs, the RNG can be defined using lune-based β-Skeletons
  - an edge is added between points u and v if there are no points of V in the intersection of the two circles centred at u and v with radius d(u, v)
  - more strictly, if the intersection of the two closed 2-balls so constructed is empty of points from V
  - this can easily be generalised to the intersection of closed n-balls in n-dimensional space.
Gabriel Graphs
Definition
Gabriel Graphs
- The Gabriel Graph was originally defined for the Euclidean plane [18].
- However the idea can easily be generalised to other spaces where the distance between each pair of points is defined.
- More precisely, given two points u and v, the edge (u, v) is in the Gabriel Graph if for all other points w, d(u, v)² ≤ d(u, w)² + d(v, w)².
- Figure 4 gives the Gabriel Graph of the example point set using the Euclidean distance.
Figure 4: The Gabriel Graph of the
example point set.
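The inequality d(u, v)² ≤ d(u, w)² + d(v, w)² can be checked directly for every other point w. The sketch below is a brute-force O(n³) illustration; `gabriel_graph` and the sample points are assumed names and data, not taken from the slides.

```python
import math
from itertools import combinations

def gabriel_graph(points, d=math.dist):
    """Return Gabriel Graph edges (u, v), u < v, using the squared-distance test."""
    n = len(points)
    edges = []
    for u, v in combinations(range(n), 2):
        duv2 = d(points[u], points[v]) ** 2
        # Keep (u, v) only if no other point w violates the inequality.
        if all(duv2 <= d(points[u], points[w]) ** 2 + d(points[v], points[w]) ** 2
               for w in range(n) if w != u and w != v):
            edges.append((u, v))
    return edges

# A third point far enough from the segment leaves all three edges in place.
print(gabriel_graph([(0, 0), (2, 0), (1, 1.2)]))
# A third point close to the segment removes the edge (0, 1).
print(gabriel_graph([(0, 0), (2, 0), (1, 0.5)]))
```

Because the test only uses distances, the same code applies to any space in which d is defined for every pair of points.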
β-Skeletons
Definition
β-Skeletons
- The β-Skeleton of a set of points is a generalisation of some of the geometric notions that lead to RNGs and Gabriel Graphs.
- Broadly speaking, an edge (u, v) is in the β-Skeleton if no points lie in a region defined by circles related to the two points and the value of the parameter β.
  - As special cases, we immediately obtain the RNG and Gabriel Graph.
- There are two definitions of β-Skeletons: the circle-based definition and the lune-based definition.
[Annotation: not covered in the lecture.]
β-Skeletons
Definition
Circle-Based Definition: The parameter β is a positive real number that gives the following angle θ:

$$\theta = \begin{cases} \sin^{-1}\frac{1}{\beta}, & \text{if } \beta \ge 1 \\ \pi - \sin^{-1}\beta, & \text{if } \beta \le 1 \end{cases}$$

- For any two points u and v
  - let P_uv be the set of points p such that ∠upv ≥ θ
  - the edge (u, v) is in the β-Skeleton if P_uv is empty.
- More intuitively, consider a pair of circles with diameter β · d(u, v) if β ≥ 1, or $\frac{d(u,v)}{\beta}$ if β < 1, where u and v lie on the circumference
  - it is easy to verify that, in the Euclidean plane at least, there are only two such circles
  - if β ≤ 1, the edge is added if there are no points in the intersection of the circles
  - if β ≥ 1, the edge is added if there are no points in the union of the circles
  - the case β = 1 is degenerate, and the two circles are identical.
[Annotation: not covered in the lecture.]
β-Skeletons
β-Skeletons Example
Figure 5: The three basic cases for determining the edges of the β-skeleton in the
circle-based definition.
- When β ≤ 1 (a), the area of intersection of the two circles must be free of any other points for the edge (u, v) to exist.
- β = 1 (b) gives the degenerate case where the intersection is precisely the union, i.e. the two circles are identical.
- When β ≥ 1 (c), the union of the two circles must be empty.
β-Skeletons
Lune-Based Definition
Lune-Based Definition: Alternatively, instead of the line segment uv forming a chord
- for β ≥ 1 the circles may be placed such that uv lies on a diameter of both circles, with u and v on the boundary of the intersection.
- That is, the lune is formed from the intersection of two circles of radius $\frac{\beta \cdot d(u,v)}{2}$ centred at $(1 - \frac{\beta}{2})u + \frac{\beta}{2}v$ and $\frac{\beta}{2}u + (1 - \frac{\beta}{2})v$.
- When β ≤ 1, the definition is identical to the circle-based definition.
Figure 6: The alternate, lune-based definition when β ≥ 1.
- When β = 1, the circles coincide.
- As β approaches infinity, the region of intersection approaches the infinite region bounded by the tangents to the circles at u and v.
- As β approaches zero, the lune approximates the line segment between u and v.
β-Skeletons
β-Skeletons - Lune-Based Definition
- The two definitions coincide when β ≤ 1, and for larger values of β
  - the circle-based definition gives a subgraph of the lune-based definition.
- The Gabriel Graph is precisely the case where β = 1.
- The RNG is the case β = 2 in the lune-based definition.
- In general, for two values β_1 and β_2 with β_1 ≤ β_2, the β_1-Skeleton is a supergraph of the β_2-Skeleton
  - it follows that the RNG is a subgraph of the Gabriel Graph.
- We note however that the Delaunay Triangulation is not in general a β-Skeleton.
[Annotation: not covered in the lecture.]
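The lune-based definition for β ≥ 1 can be sketched as a brute-force test in the Euclidean plane. The code below assumes 2D coordinate tuples and treats the lune as open (a point exactly on the boundary does not block an edge), which matches the strict inequalities of the RNG definition. The function name and sample points are illustrative.

```python
import math
from itertools import combinations

def lune_beta_skeleton(points, beta):
    """Return lune-based beta-skeleton edges (u, v), u < v, for beta >= 1."""
    if beta < 1:
        raise ValueError("this sketch only covers the lune-based case beta >= 1")
    n = len(points)
    h = beta / 2
    edges = []
    for u, v in combinations(range(n), 2):
        (ux, uy), (vx, vy) = points[u], points[v]
        r = h * math.dist(points[u], points[v])           # radius beta * d(u, v) / 2
        c1 = ((1 - h) * ux + h * vx, (1 - h) * uy + h * vy)
        c2 = (h * ux + (1 - h) * vx, h * uy + (1 - h) * vy)
        # A point blocks the edge if it lies inside both circles, i.e. in the lune.
        blocked = any(
            math.dist(points[w], c1) < r and math.dist(points[w], c2) < r
            for w in range(n) if w not in (u, v)
        )
        if not blocked:
            edges.append((u, v))
    return edges

pts = [(0, 0), (2, 0), (0.9, 1.2)]
gabriel = lune_beta_skeleton(pts, 1)   # beta = 1: Gabriel Graph
rng = lune_beta_skeleton(pts, 2)       # beta = 2: RNG
```

On this point set the β = 2 skeleton loses the edge (0, 1) that the β = 1 skeleton keeps, illustrating that a larger β gives a subgraph.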
β-Skeletons
β-Skeletons - Lune-Based Definition
- The naïve algorithm (testing all triples) gives an O(n³) time bound.
- For suitable cases, where β ≥ 1
  - we know that the β-Skeleton is a subgraph of the Gabriel Graph
  - hence a subgraph of the Delaunay Triangulation, giving an O(n log n) algorithm for constructing the β-Skeleton
  - again O(n) where the triangulation is given.
- In the planar Euclidean case we may also compute the β < 1 case in O(n²) time [21]
  - there is no better upper bound possible, as cases which produce the complete graph, and hence require Ω(n²) comparisons, exist.
- The β-Skeleton was proposed by Kirkpatrick & Radke [25] based on the idea of α-Shapes [13].
- Generalisations using shapes other than circles also exist [5, 41].
[Annotation: not covered in the lecture.]
Delaunay Triangulations
Definition
Delaunay Triangulations
A Delaunay Triangulation of a set of points (defined on the Euclidean plane) is a triangulation where, for every triangle, the circumcircle contains no other points.
- A Delaunay Triangulation¹ is built from a set of points in a similar manner to the Gabriel Graph, however three points are used instead of two.
- A triangulation of a set of points is any set of edges between the points such that for any set of edges that form a polygon, if there are more than 3 points, the polygon has a chord.
  - Note that the definition is implicitly recursive, so the resulting graph is a collection of incident triangles.
- There are some sets of points for which the Delaunay Triangulation is not unique
  - when generalising to higher dimensions or non-Euclidean metrics, a Delaunay Triangulation may not exist.
[Annotation: not covered in the lecture.]
¹ Note also that the Voronoi Diagram of a set of points in general position is the dual graph of the Delaunay Triangulation.
Delaunay Triangulations
Delaunay Triangulations Example
Figure 7: The Delaunay Triangulation of the example point set. The dashed lines indicate the complementary Voronoi Diagram.
[Annotation: find the two nearest points, join them with a line, then take the perpendicular bisector.]
Delaunay Triangulations
Definition
- The Delaunay Triangulation, as mentioned, contains the Gabriel Graph, the RNG and hence the MST as subgraphs when using suitable metrics, and in the planar Euclidean case in particular.
- In the Euclidean plane each point has on average six surrounding triangles.
- Combining these facts gives the simple O(n)-time algorithms for finding RNGs and Gabriel Graphs when given a Delaunay Triangulation.
- There are a number of algorithms for computing a Delaunay Triangulation.
Edge Flipping algorithms to compute Delaunay Triangulation
If two triangles that share an edge do not satisfy the Delaunay condition
- (i.e. all four points are in the circumcircle of at least one of the triangles)
then the common edge can be “flipped” by swapping it for the other potential shared edge
- i.e. if the triangles are on points w, x, y and x, y, z respectively,
- then the common edge (x, y) can be flipped to the new common edge (w, z).
Delaunay Triangulations
Algorithms
The resulting triangle pair will now satisfy the Delaunay condition.
- This prompts a simple algorithm: construct any triangulation, then proceed to flip edges until no triangle violates the Delaunay condition.
  - This leads to an O(n²) algorithm, and can be generalised to higher dimensions [10]
  - however the higher dimensional version may not converge: while the flip operation remains valid, it may disconnect the graph.
- Alternatively, we may begin with a single triangle and incrementally add points to the triangulation, repairing the graph at each step via edge flipping.
  - If done without care, this leads to an O(n²) algorithm.
  - However, using a random insertion order and suitable data structures to quickly find triangles that need repairing in O(log n) time, taken over all vertices this gives an O(n log n) time algorithm [44].
  - This algorithm can be safely generalised to higher dimensions [12]
    - however the running time may be exponential in the dimension.
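The core of any flipping algorithm is the test of whether a point lies inside a triangle's circumcircle. A minimal sketch of that test for the plane, using the standard determinant predicate, is given below; the function names are illustrative, and plain floating point is assumed (robust implementations typically use exact arithmetic near degenerate cases).

```python
def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circumcircle of triangle abc."""
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    # The sign of det depends on the orientation of a, b, c, so normalise by it.
    orient = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return det * orient > 0

def should_flip(w, x, y, z):
    """Triangles (w, x, y) and (x, y, z) share edge (x, y); flip it to (w, z)
    if z violates the empty-circumcircle condition of (w, x, y)."""
    return in_circumcircle(w, x, y, z)

# Two thin triangles sharing the long edge (x, y): the edge should be flipped.
w, x, y, z = (1, 0.3), (0, 0), (2, 0), (1, -0.3)
print(should_flip(w, x, y, z))
# After flipping to (w, z), the new pair (x, w, z), (w, z, y) needs no further flip.
print(should_flip(x, w, z, y))
```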
Delaunay Triangulations
Algorithms
- The Bowyer-Watson algorithm uses a similar incremental approach
  - adding a point to the triangulation at each step
  - instead of flipping edges, it removes all conflicting triangles and retriangulates the polygonal hole generated.
- The algorithm was discovered independently by Bowyer [4] and Watson [43]².
- The algorithm can triangulate a set of n points in O(n log n) time in any dimensionality
  - special cases exist which lead to O(n²) running time [32].
² It is interesting to note that not only was it discovered concurrently, but both articles were published in the same issue of the Computer Journal.
Delaunay Triangulations
Algorithms
- In two dimensions, the most efficient [36] algorithm employs a divide-and-conquer approach
  - the set of points is recursively split in two by a line
  - the Delaunay Triangulation is computed for each partite set
  - the two sets are merged along the separating line.
- The merging of the sets can be performed in linear time, giving a worst case running time of O(n log n).
- A divide and conquer approach was first proposed by Shamos & Hoey [34] and subsequently improved upon [28, 19, 27].
- Divide and conquer approaches can also be employed in higher dimensions [8] with good empirical performance, but no theoretical reduction in complexity.
Urquhart Graphs
Definition
Urquhart Graphs
The Urquhart Graph of a set of points is obtained by removing the longest edge from each triangle in the Delaunay Triangulation graph. [Annotation: skimmed over; see Wikipedia.]
- Originally suggested [40] as a possible way of quickly computing the RNG.
- It was later demonstrated [38] that the resultant graph was not always the RNG of the set of points. It however remains an efficient approximation, although somewhat superseded by later results [37].
- As might be expected by this point, the Urquhart Graph contains the minimum spanning tree of the point set.
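Given a Delaunay Triangulation, the construction is a single pass over its triangles. The sketch below assumes the triangulation is supplied as index triples (computing it is out of scope here); the function name, the sample points and their triangulation are illustrative.

```python
import math

def urquhart_edges(points, triangles):
    """Remove the longest edge of every triangle from the triangulation's edge set."""
    def length(e):
        return math.dist(points[e[0]], points[e[1]])

    all_edges, longest = set(), set()
    for a, b, c in triangles:
        tri = [tuple(sorted(e)) for e in ((a, b), (b, c), (a, c))]
        all_edges.update(tri)
        longest.add(max(tri, key=length))
    return sorted(all_edges - longest)

# Four points whose Delaunay Triangulation consists of the two triangles below.
points = [(0, 0), (1, 0.3), (1, -0.4), (2.2, 0)]
triangles = [(0, 1, 2), (1, 2, 3)]
print(urquhart_edges(points, triangles))
```

An edge shared by two triangles is dropped if it is the longest edge of either of them.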
Sphere of Influence Graphs
Definition
Sphere of Influence Graphs
Given a set of points P, for each point v we find the nearest neighbour n_v, and draw a circle centred at v with radius d(v, n_v). The edge (u, v) is added:
- if the circles for any two points u and v intersect
- equivalently, if the distance between u and v is less than d(v, n_v) + d(u, n_u).
- The Sphere of Influence Graph retains aspects of the β-Skeleton approach, but essentially inverts the edge existence condition.
- Naturally, this graph includes the Nearest Neighbour Graph as a subgraph
  - but is not guaranteed to include the MST as a subgraph
  - in fact, it may not even be connected.
[Annotation: not covered in the lecture.]
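A brute-force sketch of the Sphere of Influence Graph follows. It uses the strict inequality stated above, assumes at least two distinct points, and the function name and sample points are illustrative.

```python
import math
from itertools import combinations

def sphere_of_influence_graph(points, d=math.dist):
    """Return edges (u, v), u < v, whose nearest-neighbour circles overlap."""
    n = len(points)
    # radius[v] = d(v, n_v), the distance from v to its nearest neighbour.
    radius = [min(d(points[v], points[w]) for w in range(n) if w != v)
              for v in range(n)]
    return [(u, v) for u, v in combinations(range(n), 2)
            if d(points[u], points[v]) < radius[u] + radius[v]]

# Two well-separated pairs: the resulting graph is not connected.
print(sphere_of_influence_graph([(0, 0), (1, 0), (5, 0), (6.5, 0)]))
```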
Sphere of Influence Graphs
Example
[Annotation: not covered in the lecture.]
Figure 8: Sphere of Influence Graph for the example point set.
Shortest Path Trees
Definition
- While the Minimum Spanning Tree gives a minimally connected subgraph³ of total minimum weight, we may prefer instead to minimise path lengths
  - for instance, if we are interested in distances in particular, say for modelling travel times or resource allocation.
- Given a nominated starting vertex, we can build a tree which captures the shortest path from that vertex to every other vertex.
- We can efficiently construct a Shortest Path Tree in the following manner.
Shortest Path Trees
Given the distances from a nominated starting vertex v to every other vertex, which can be computed using any suitable algorithm:
- for every other vertex u, select a parent p for u out of the neighbours N(u) such that d(v, p) + l(p, u) = d(v, u)
- a tie-break rule may be needed, such as selecting the parent with the smallest number of edges to v along its shortest path.
Assuming the initial graph is connected, the Shortest Path Tree is guaranteed to exist, but may not be unique, much like MSTs.
[Annotation: not covered in the lecture.]
³ In the case of a set of points, of the implicit complete graph.
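The parent-selection rule can be sketched as follows. Dijkstra's algorithm is used here as one "suitable algorithm" for the distances; the graph is assumed to be connected and undirected, with strictly positive edge lengths, and is stored as a dict of dicts. The tie-break used (the candidate parent closest to the source) is a simplification chosen for the sketch, not the edge-count rule mentioned above.

```python
import heapq
import math

def dijkstra(adj, source):
    """Shortest distances from source; adj[u][w] is the length l(u, w)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        du, u = heapq.heappop(heap)
        if du > dist.get(u, math.inf):
            continue  # stale heap entry
        for w, length in adj[u].items():
            nd = du + length
            if nd < dist.get(w, math.inf):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist

def shortest_path_tree(adj, source):
    """Return a parent map: parent[u] = p with d(v, p) + l(p, u) = d(v, u)."""
    dist = dijkstra(adj, source)
    parent = {}
    for u in adj:
        if u == source:
            continue
        candidates = [p for p, length in adj[u].items()
                      if math.isclose(dist[p] + length, dist[u])]
        parent[u] = min(candidates, key=lambda p: dist[p])  # simple tie-break
    return parent

adj = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 1},
    "d": {"b": 5, "c": 1},
}
print(shortest_path_tree(adj, "a"))
```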
Minimum Steiner Trees
Definition
Minimum Steiner Trees
A Steiner Tree of a set of points P is a minimum spanning tree of an augmented set of points P ∪ S such that the points in S (the Steiner points) are added to reduce the total weight of the tree.
- We can view it in purely graph theoretic terms as:
“given an edge-weighted graph G with a nominated set of required vertices P, a Steiner Tree is a tree in G that spans P; the additional vertices of G that are used in the tree constitute S”
- In the Euclidean plane, the input graph is ‘simply’ the implicit infinite complete graph consisting of the set of points.
- The Steiner points are Fermat points [20], which leads to the observation that at most n − 2 Steiner points are needed.
- However, for more general cases, no such bound necessarily holds.
- Unlike the other proximity graphs mentioned so far, the minimum weight Steiner Tree is NP-hard to compute, even on the Euclidean plane.
[Annotation: not covered in the lecture.]
Nearest Neighbour Graphs
Definition
Nearest Neighbour Graphs
The Nearest Neighbour Graph (NNG) of a set of points is the smallest collection of edges such that each point is connected to all its nearest points.
- Given a set of points P, a point q is a nearest point of point p if d(p, q) = min_{r ∈ P} {d(p, r)}.
- Although the NNG is often presented as an undirected graph, the property of being a nearest neighbour is not symmetric, and thus the NNG is more strictly a directed graph.
- The NNG is a subgraph of the Gabriel Graph and hence the Delaunay Triangulation.
- If the number of nearest neighbours chosen is restricted to one, the NNG is a subgraph of the MST.
- The full NNG may contain cycles however.
Nearest Neighbour Graphs
Definition
- As would be expected, the NNG is not typically connected, but nonetheless finds many uses in a number of optimization fields.
- The NNG naturally generalises to the k-Nearest Neighbour Graph (k-NNG)
  - where each vertex is connected to a set of nearest points which are at a distance less than or equal to the k-th nearest distance.
Figure 9 shows the 1-NN, 3-NN and 6-NN for the example point set.
Figure 9: The k-Nearest Neighbour Graphs for the example point set with k ∈ {1, 3, 6}.
- For the 1-NN it is easy to verify that it is a subgraph of the MST.
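The k-NNG definition can be sketched as a directed graph, which makes the asymmetry of the nearest-neighbour relation visible. The function name and sample points are illustrative; ties at the k-th nearest distance are all included, following the "less than or equal to" wording above.

```python
import math

def knn_graph(points, k=1, d=math.dist):
    """Return directed arcs (p, q): q is within the k-th nearest distance of p."""
    n = len(points)
    arcs = []
    for p in range(n):
        dists = sorted((d(points[p], points[q]), q) for q in range(n) if q != p)
        if not dists:
            continue  # a single point has no neighbours
        kth = dists[min(k, len(dists)) - 1][0]
        arcs.extend((p, q) for dist_pq, q in dists if dist_pq <= kth)
    return arcs

pts = [(0, 0), (1, 0), (3, 0)]
# Point 2 points at point 1, but point 1 points back at point 0 instead.
print(knn_graph(pts, k=1))
```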
Unit Disk Graphs
Definition
Unit Disk Graph
The Unit Disk Graph (UDG) [9] is a simple geometric intersection graph, similar to a simplified Sphere of Influence Graph.
- Two points u and v are connected if d(u, v) ≤ c, for some chosen constant c⁴.
- Equivalently, this may be considered as the intersection graph of the set of radius c circles centred at the given points.
- This is one of the simplest proximity graphs, but still finds natural applications in computer networking and, more interestingly, as a model of random graphs associated with percolation phenomena.
- The UDG of a set of points can be computed in linear time [2], for any space of a fixed dimension, with suitable generalisation to spheres.
[Annotation: not covered in the lecture.]
⁴ c being the length of the ‘unit’ in the graph’s title.
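One common way to avoid comparing every pair of points is to bucket the points into a grid of cells with side c, so that only points in the same or adjacent cells need to be compared. The sketch below illustrates that idea for the plane; it is an assumed illustration of grid bucketing, not a reproduction of the algorithm of [2], and the function name and sample points are hypothetical.

```python
import math
from collections import defaultdict

def unit_disk_graph(points, c):
    """Return edges (i, j), i < j, with Euclidean distance at most c (2D points)."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(math.floor(x / c), math.floor(y / c))].append(i)
    edges = set()
    for (gx, gy), members in grid.items():
        # Points within distance c can only lie in the same or an adjacent cell.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((gx + dx, gy + dy), ()):
                    for i in members:
                        if i < j and math.dist(points[i], points[j]) <= c:
                            edges.add((i, j))
    return sorted(edges)

print(unit_disk_graph([(0, 0), (0.1, 0), (0.5, 0), (0.19, 0.05)], c=0.2))
```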
Unit Disk Graphs
Example
[Annotation: not covered in the lecture.]
Figure 10: Disk Graph for the example point set. For the example the unit length was
taken as 0.2.
An Application to Recommender Systems
The set of 5-star hotels in Venice, Italy.
- Clearly the range of proximity graphs spans a wide array of concepts and provides a significant degree of flexibility for extracting interesting properties and information from a data set.
- We briefly explore the utility of proximity graphs through the lens of Recommender Systems:
  - Suppose we have a set of objects with a variety of properties, and we wish to provide a method for recommending other similar objects from the set, given a choice of one or several objects from the set.
  - For example, on music or book websites, when a customer purchases an item, the system recommends other items that they may find interesting.
An Example Application to Recommender
Systems
- Here we present a simple example using hotels in Venice, Italy, drawn from the travel website tripadvisor.com.
  - We reduced the total set of hotels to only the 18 five star hotels.
  - Each hotel is rated by visitors and given from one to five stars⁵.
- We normalised the count of each rating category by the total number of ratings
  - took each of these proportions as a dimension
  - and so obtained a set of points in the five-dimensional unit hypercube.
- Using this point set we constructed a number of the proximity graphs mentioned in this chapter, presented in Figure 12.
Figure 11: The set of 5-star hotels in Venice, Italy. The arrangement is a 2-dimensional projection of the underlying 5-dimensional ratings data set.
⁵ Not to be confused with the amenity based star rating system!
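The normalisation step can be sketched as below. The hotel names and rating counts are hypothetical placeholders (the actual tripadvisor data is not reproduced in the slides), and `rating_profile` is an illustrative name.

```python
def rating_profile(counts):
    """Turn the counts of 5-, 4-, 3-, 2- and 1-star ratings into proportions."""
    total = sum(counts)
    if total == 0:
        raise ValueError("a hotel with no ratings has no profile")
    return tuple(c / total for c in counts)

# Hypothetical counts per rating category, not taken from the original data set.
hotels = {
    "Hotel A": (120, 40, 20, 10, 10),
    "Hotel B": (50, 30, 10, 5, 5),
}
points = {name: rating_profile(counts) for name, counts in hotels.items()}
print(points["Hotel A"])
```

Each profile is a point in the five-dimensional unit hypercube, and its coordinates sum to 1.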
An Application to Recommender Systems
The set of 5-star hotels in Venice, Italy.
Figure 12: Proximity Graphs constructed on the set of points for the Venetian hotels from tripadvisor. The x axis is the proportion of “Excellent” reviews, the y axis is the proportion of “Very Good” reviews. a) The Minimum Spanning Tree for the data set. We note that for this particular data set, the Relative Neighbourhood Graph, the Minimum Spanning Tree and the Shortest Path Graph happen to coincide. b) The 1-Nearest Neighbour Graph for the data set. c) The Sphere of Influence Graph for the data set. d) The 3-Nearest Neighbour Graph for the data set. e) The Unit Disk Graph for the data set, with disk radius set to 0.05.
An Application to Recommender Systems
The Minimum Spanning Tree (MST)
- Unsurprisingly the MST provides only limited utility for our purposes
  - each vertex has at least one other neighbour which is ‘close’
  - and which could be recommended as similar.
[Annotation: none of the following slides were covered in the lecture.]
Figure 13: The Minimum Spanning Tree
An Application to Recommender Systems
The 1-Nearest Neighbour Graph
- The 1-Nearest Neighbour Graph (Figure 12b), although a subgraph of the MST
  - perhaps provides a better basis for recommendations as it separates the vertices into connected components
  - which we can take as clusters of similar hotels.
Figure 14: The 1-Nearest Neighbour Graph
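Reading connected components off a proximity graph as clusters can be sketched with a small union-find structure. The function name and the example edge list are illustrative; edge direction is ignored, so directed nearest-neighbour arcs can be passed in directly.

```python
def connected_components(n, edges):
    """Group vertices 0..n-1 into connected components, ignoring edge direction."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)

    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())

# Arcs of a hypothetical 1-Nearest Neighbour Graph on five items.
arcs = [(0, 1), (1, 0), (2, 1), (3, 4), (4, 3)]
print(connected_components(5, arcs))
```

Each returned group can be treated as a cluster from which recommendations for any of its members are drawn.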
An Application to Recommender Systems
The Sphere of Influence Graph
- The Sphere of Influence Graph (Figure 12c) produces an output similar to the 1-NN
  - but with larger connected components
  - which could provide more flexibility and range in the recommendations possible.
Figure 15: The Sphere of Influence Graph
An Application to Recommender Systems
The 3-Nearest Neighbour Graph
- The 3-NN graph takes a further step in this direction.
  - It indicates that there is a fairly clear distinction between two groups of hotels within this group.
- These graphs guarantee that each vertex will have at least one neighbour
  - a possibly desirable property, in that a recommendation can always be provided.
- However this has the caveat that data points that are truly unique are still given neighbours
  - hence, the recommendation will be extremely poor.
- In particular, consider the two extrema in the top left and bottom right corners
  - if these two dimensions constituted the full underlying data set, their inclusion with any other vertex in a cluster would be dubious.
Figure 16: The 3-Nearest Neighbour Graph
An Application to Recommender Systems
The Unit Disk Graph
- The Unit Disk Graph provides some remedy to this, although with drawbacks of its own.
  - For this example the disk radius was manually selected as 0.05.
  - In general it is not necessarily clear how to select the radius.
- However in this case:
  - it provides a series of clusters, but always excludes long edges, a desirable property for recommender systems.
Figure 17: The Unit Disk Graph
An Application to Recommender Systems
General Remarks
- We can use the proximity graphs generated as a simple recommender system, or as part of a more complex system.
- Of course here we employ a simplified model; a fuller model would include other aspects such as location, price and availability.
- Nonetheless, when choosing a hotel, it is reasonable that the customer would be interested in seeing similar hotels, particularly with regard to the quality of the hotel.
- The visitor ratings provide one measure of a hotel’s quality and standing.
- Thus we can explore the idea that hotels that are ‘nearby’ in ratings might be of interest to the hypothetical customer.
- This naturally leads us to proximity graphs, which give a variety of ways of expressing ‘nearness’ in multifaceted data sets.
Conclusion
- While we have briefly covered a number of proximity graphs, the topic includes much more than we can include here.
- Geometric graph theory in particular is a fertile ground for proximity graph definitions, and optimization provides a great number of applications, particularly as components in larger algorithms.
- One unfortunate aspect however is the lack of a coherent presentation of detailed material on proximity graphs, with the literature scattered piecemeal across a number of fields.
References I
[1] Richard Bellman. “On a routing problem”. In: Quarterly of Applied Mathematics 16 (1958), pp. 87–90.
[2] Jon L. Bentley, Donald F. Stanat, and E. Hollins Williams. “The complexity of finding fixed-radius near neighbors”. In: Information Processing Letters 6.6 (1977), pp. 209–212.
[3] Otakar Borůvka. “O jistém problému minimálním”. In: Práce Moravské přírodovědecké společnosti 3.3 (1926), pp. 37–58.
[4] A. Bowyer. “Computing Dirichlet tessellations”. In: The Computer Journal 24.2 (1981), pp. 162–166.
[5] Jean Cardinal, Sébastien Collette, and Stefan Langerman. “Empty region graphs”. In: Computational Geometry Theory & Applications 42.3 (2009), pp. 183–195.
[6] Bernard Chazelle. “A Minimum Spanning Tree Algorithm with inverse-Ackermann Type Complexity”. In: Journal of the ACM 47.6 (2000), pp. 1028–1047.
[7] Gustave Choquet. “Étude de certains réseaux de routes”. In: Comptes-rendus de l’Académie des Sciences 206 (1938), pp. 310–313.
[8] P. Cignoni, C. Montani, and R. Scopigno. “DeWall: A fast divide and conquer Delaunay triangulation algorithm in E^d”. In: Computer-Aided Design 30.5 (1998), pp. 333–341.
References II
[9] Brent N. Clark, Charles J. Colbourn, and David S. Johnson. “Unit disk graphs”. In: Discrete Mathematics 86.1 (1990), pp. 165–177.
[10] Jesús De Loera, Jörg Rambau, and Francisco Santos. “Triangulations, Structures for Algorithms and Applications”. In: Algorithms and Computation in Mathematics. Vol. 25. Springer, 2010.
[11] Edsger W. Dijkstra. “A note on two problems in connexion with graphs”. In: Numerische Mathematik (1959), pp. 269–271.
[12] H. Edelsbrunner and N. R. Shah. “Incremental topological flipping works for regular triangulations”. In: Algorithmica 15.3 (1996), pp. 223–241.
[13] Herbert Edelsbrunner, David G. Kirkpatrick, and Raimund Seidel. “On the shape of a set of points in the plane”. In: IEEE Transactions on Information Theory 29.4 (1983), pp. 551–559.
[14] Kazimierz Florek. “Sur la liaison et la division des points d’un ensemble fini”. In: Colloquium Mathematicum 2 (1951), pp. 282–285.
[15] Robert W. Floyd. “Algorithm 97: Shortest Path”. In: Communications of the ACM 5.6 (1962), p. 345.
[16] L. R. Ford. Network Flow Theory. Tech. rep. P923. Santa Monica, USA: RAND Corporation, 1956.
References III
[17] Michael L. Fredman and Robert Endre Tarjan. “Fibonacci Heaps and Their Uses in Improved Network Optimization Algorithms”. In: Journal of the ACM 34.3 (1987), pp. 596–615.
[18] K. R. Gabriel and R. R. Sokal. “A new statistical approach to geographic variation analysis”. In: Systematic Zoology 18 (1969), pp. 259–278.
[19] Leonidas Guibas and Jorge Stolfi. “Primitives for the Manipulation of General Subdivisions and the Computation of Voronoi”. In: ACM Transactions on Graphics 4.2 (1985), pp. 74–123.
[20] “Fermat-Torricelli problem”. In: Encyclopedia of Mathematics. Ed. by Michiel Hazewinkel. Springer, 2001.
[21] Ferran Hurtado, Giuseppe Liotta, and Hank Meijer. “Optimal and suboptimal robust algorithms for proximity graphs”. In: Computational Geometry Theory & Applications 25.1–2 (2003), pp. 35–49.
[22] Vojtěch Jarník. “O jistém problému minimálním”. In: Práce Moravské přírodovědecké společnosti 6.4 (1930), pp. 57–63.
[23] Donald B. Johnson. “Efficient Algorithms for Shortest Paths in Sparse Networks”. In: Journal of the ACM 24.1 (1977), pp. 1–13.
[24] David R. Karger, Philip N. Klein, and Robert E. Tarjan. “A Randomized Linear-time Algorithm to Find Minimum Spanning Trees”. In: Journal of the ACM 42.2 (1995), pp. 321–328.
References IV
[25] David G. Kirkpatrick and John D. Radke. “A framework for computational morphology”. In: Computational Geometry, Machine Intelligence and Pattern Recognition. Vol. 2. Amsterdam: North-Holland, 1985, pp. 217–248.
[26] Joseph B. Kruskal. “On the shortest spanning subtree of a graph and the traveling salesman problem”. In: Proceedings of the American Mathematical Society 7 (1956), pp. 48–50.
[27] Geoff Leach. “Improving Worst-Case Optimal Delaunay Triangulation Algorithms”. In: Proceedings of the 4th Canadian Conference on Computational Geometry. 1992, p. 15.
[28] D. T. Lee and B. Schachter. “Two Algorithms for Constructing Delaunay Triangulations”. In: International Journal of Computer and Information Sciences 9.3 (1980), pp. 219–242.
[29] A. Lingas. “A linear-time construction of the relative neighborhood graph from the Delaunay triangulation”. In: Computational Geometry 4.4 (1994), pp. 199–208.
[30] Edward F. Moore. “The shortest path through a maze”. In: Proceedings of the International Symposium on Switching Theory 1957, Part II. Harvard Univ. Press, Cambridge, Mass., 1959, pp. 285–292.
[31] Robert C. Prim. “Shortest connection networks and some generalizations”. In: Bell System Technical Journal 36.6 (1957), pp. 1389–1401.
References V
[32] S. Rebay. “Efficient Unstructured Mesh Generation by Means of Delaunay Triangulation and Bowyer-Watson Algorithm”. In: Journal of Computational Physics 106.1 (1993), p. 127.
[33] Bernard Roy. “Transitivité et connexité”. In: Comptes rendus de l’Académie des sciences 249 (), pp. 216–218.
[34] M. Shamos and D. Hoey. “Closest-point problems”. In: Proceedings of the 16th Annual IEEE Symposium on Foundations of Computer Science. 1975, pp. 151–162.
[35] M. Sollin. “Le tracé de canalisation”. In: Programming, Games, and Transportation Networks (1965).
[36] Peter Su and Robert L. Scot Drysdale. “A Comparison of Sequential Delaunay Triangulation Algorithms”. In: Proceedings of the Eleventh Annual Symposium on Computational Geometry. Vancouver, British Columbia, Canada: ACM, 1995, pp. 61–70.
[37] K. J. Supowit. “The relative neighborhood graph, with an application to minimum spanning trees”. In: Journal of the ACM 30.3 (1983), pp. 428–448.
[38] G. T. Toussaint. “Comment: Algorithms for computing relative neighbourhood graph”. In: Electronics Letters 16.22 (1980), p. 860.
[39] G. T. Toussaint. “The relative neighborhood graph of a finite planar set”. In: Pattern Recognition 12.4 (1980), pp. 261–268.
References VI
[40] R. B. Urquhart. “Algorithms for computation of relative neighbourhood graph”. In: Electronics Letters 16.14 (1980), p. 556.
[41] Remko C. Veltkamp. “The γ-neighborhood graph”. In: Computational Geometry Theory & Applications 1.4 (1992), pp. 227–246.
[42] Stephen Warshall. “A Theorem on Boolean Matrices”. In: Journal of the ACM 9.1 (1962), pp. 11–12.
[43] D. F. Watson. “Computing the n-dimensional Delaunay tessellation with application to Voronoi polytopes”. In: The Computer Journal 24.2 (1981), pp. 167–172.
[44] Mark de Berg et al. “Computational Geometry: Algorithms and Applications”. In: Springer-Verlag, 2008. Chap. Delaunay Triangulations: Height Interpolation.