This is an example of a bad talk
(Disclaimer: The paper that should
have been presented in this talk is a
classic in the field, a great paper: this
talk, not the paper, is rotten).
On the Foundations of Relaxation
Labeling Processes
By An Anonymous Student
Overview
• Motivation
• I. Introduction to Labeling Problems
• II. Continuous Relaxation Labeling Processes
• III. Consistency
• IV. Overview of Results
• V. Average Local Consistency
• VI. Geometric Structure of Assignment Space
• VII. Maximizing Average Local Consistency
• VIII. The Relaxation Labeling Algorithm
• IX. A Local Convergence Result
• X. Generalizations to Higher Order Compatibilities
• XI. Comparisons with Standard Relaxation Labeling Updating Schemes
• XII. Summary and Conclusions
• Appendix A
Motivation
Motivation
• Two concerns:
– The decomposition of a complex computation
into a network of simple “myopic”, or local,
computations
– The requisite use of context in resolving
ambiguities
Motivation
• Relaxation operations: To solve systems of
linear equations, etc.
• Relaxation labeling:
– Extension of relaxation operations
– Solutions involve symbols rather than functions.
– Weights are attached to the labels
• Main difference: Labels do not necessarily
have a natural ordering
Motivation
• Algorithm:
– Parallel
– Each process makes use of the context to assist
in a labeling decision
• Goal
– Provide a formal foundation
• Characterize what the algorithm is doing, so that the cause of a failure can be attributed to an inadequate theory
Motivation
• Treatment
– Abstract
• To relate discrete relaxation to a description of the
usual relaxation labeling schemes
• To develop a theory of consistency
• To formalize its relationship to optimization
• Several mathematical results
I. Introduction to Labeling Problems
• In a labeling problem, one is given:
– A set of objects
– A set of labels for each object
– A neighbor relation over the objects
– A constraint relation over labels at pairs (or n-tuples) of neighboring objects
• Solution: An assignment of labels to each object in
a manner which is consistent with respect to the
constraint relation
I. Introduction to Labeling Problems
• λ: a variable denoting a label, or an index ranging over a set of labels
• Λi: the set of labels attached to node i
• Λij: the constraint relation listing all pairs (λ, λ’) such that λ at i is consistent with λ’ at j
• m: the number of labels in Λi
• n: the number of nodes in G
• Si(λ): support for label λ at node i from a discrete labeling; it counts the number of neighbors j of i carrying at least one label compatible with λ at i:
Si(λ) = Σj maxλ’ Λij(λ, λ’) pj(λ’), with pj(λ’) ∈ {0, 1}
• The max is used because more than one label can be 1 at j.
I. Introduction to Labeling Problems
• Discrete relaxation
– Label discarding rule: discard a label λ at a node i if there exists a neighbor j of i such that every label λ’ currently assigned to j is incompatible with λ at i ((λ, λ’) ∉ Λij for all λ’ assigned to j).
– A label is retained if at every neighboring node there exists at least one compatible label.
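
The rule is easy to state in code. Below is a minimal sketch (all names are illustrative, not from the paper): labels[i] is the set of labels currently assigned to node i, neighbors[i] lists its neighbors, and compatible(i, lam, j, mu) is the 0/1 constraint relation Λij.

    # Minimal sketch of the discrete label-discarding rule (illustrative names).
    def discrete_relaxation(labels, neighbors, compatible):
        changed = True
        while changed:                      # iterate until no label is discarded
            changed = False
            for i in range(len(labels)):
                for lam in list(labels[i]):
                    # Discard lam at i if some neighbor j carries no label
                    # compatible with lam at i.
                    for j in neighbors[i]:
                        if not any(compatible(i, lam, j, mu) for mu in labels[j]):
                            labels[i].discard(lam)
                            changed = True
                            break
        return labels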
II. Continuous Relaxation Labeling Processes
• Limitation of the discrete scheme in Section I:
– Pairs of labels are either compatible or completely
incompatible
– Can’t express a preference or relative dislike
• Solution:
– Continuous relaxation labeling
– Weighted values representing relative preferences
II. Continuous Relaxation Labeling Processes
• Compatibility rij(λ,λ’): relative support for label λ at object i that arises from label λ’ at object j.
– Positive: locally consistent pair
– Negative: implied inconsistency
– The magnitude of rij(λ,λ’) is proportional to the strength of the constraint
– If i and j are not neighbors: rij(λ,λ’) = 0
II. Continuous Relaxation Labeling Processes
• Difficulty: Formulating a consistent labeling
– A consistent labeling is one in which the
constraints are satisfied
– Logical constraints replaced by weighted
assertions: A new foundation is required to
describe the structural framework and the
precise meaning of the goal of consistency
II. Continuous Relaxation Labeling Processes
• Structural frameworks previously attempted:
– Define consistency as the stopping points of the algorithm
• Circular: gives no insight into what is being computed
– Regard the label weights as probabilities; use Bayesian analysis, statistical quantities, etc.
• Unsuccessful: various independence assumptions are required
– Optimization theory: compare a vector composed of the current label weights with an evidence vector computed from the weights in each label’s neighborhood
• The authors extend this approach
– Linear programming: constraints are obtained from arithmetical equivalents; preferences can be incorporated only by adding new labels
• Different and interesting, and not incompatible with the authors’ development
II. Continuous Relaxation Labeling Processes
• Prototype (original) algorithm:
– An iterative, parallel procedure analogous to the label discarding rule used in discrete relaxation
– For each object and each label, one computes the support
qi(λ) = Σj Σλ’ rij(λ, λ’) pj(λ’)
using the current assignment values pj(λ’). Then new assignment values are defined according to
pi_new(λ) = pi(λ)[1 + qi(λ)] / Σμ pi(μ)[1 + qi(μ)]
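
A minimal sketch of this update in Python (illustrative; assumes the array shapes below, and compatibilities scaled so that 1 + qi(λ) stays nonnegative): p has shape (n, m) with p[i, lam] = pi(λ), and r has shape (n, m, n, m) with r[i, lam, j, mu] = rij(λ, μ).

    import numpy as np

    # One step of the prototype updating rule (sketch).
    def prototype_update(p, r):
        # q[i, lam] = sum_j sum_mu r_ij(lam, mu) * p_j(mu)
        q = np.einsum('iajb,jb->ia', r, p)
        new = p * (1.0 + q)                          # numerator p_i(lam)[1 + q_i(lam)]
        return new / new.sum(axis=1, keepdims=True)  # normalize over labels at each object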
III. Consistency
• The definition of consistency should:
– Require only a system of inequalities
– Permit the logical constraints to be ordered, or weighted
– Allow an analytic, rather than logical or symbolic, study
• Definition of consistency:
– For unambiguous labelings
– For weighted labeling assignments
III. Consistency
• Unambiguous labeling assignment: a mapping from the set of objects into the set of labels which associates exactly one label with each object
• Space of unambiguous labelings:
K* = { p̄ : pi(λ) ∈ {0, 1} for all i, λ; Σλ pi(λ) = 1 for each i }
III. Consistency
• Weighted labeling assignments: replace the condition pi(λ) ∈ {0, 1} by the condition pi(λ) ≥ 0 (keeping Σλ pi(λ) = 1)
• K is simply the convex hull of K*
III. Consistency
• Consistency depends on constraints between pairs of labels: the compatibility matrix rij(λ, λ’), whose elements express both positive and negative constraints.
• Definition 3.1: Weighted labeling spaces require a support function defined on all of K, so replace the max with a sum in the support function (linear in p̄; compare Section I):
si(λ) = Σj Σλ’ rij(λ, λ’) pj(λ’)
III. Consistency
• Higher order combinations of object labels:
– Multidimensional matrix of compatibilities: rijk(λ, λ’, λ’’), etc.
– Support at object i for label λ:
si(λ) = Σj,k Σλ’,λ’’ rijk(λ, λ’, λ’’) pj(λ’) pk(λ’’)
• Definition 3.2: The unambiguous labeling p̄ ∈ K* is consistent providing
Σλ pi(λ) si(λ) ≥ Σλ vi(λ) si(λ), i = 1, …, n, for all v̄ ∈ K*
• Consistency in K* corresponds to satisfying a system of inequalities:
si(λi) ≥ si(λ) for all λ ∈ Λi, i = 1, …, n (λi = the label assigned to object i)
III. Consistency
• At a consistent unambiguous labeling, the support,
at each object, for the assigned label is the
maximum support at that object.
• Given a set of objects, labels, and support
functions, there may be many consistent labelings.
• Condition for consistency in K* (restated): si(λi) ≥ si(λ) for all labels λ at each object i
III. Consistency
• Definition 3.3: A weighted labeling assignment p̄ ∈ K is consistent if, for each object i,
Σλ pi(λ) si(λ) ≥ Σλ vi(λ) si(λ) for all v̄ ∈ K
• Definition 3.4: p̄ is strictly consistent if the inequalities hold strictly for all v̄ ∈ K, v̄ ≠ p̄
• An unambiguous assignment that is consistent in K will also be consistent in K*, since K* ⊂ K. The converse is also true (Proposition 3.5).
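
Since a weighted average of supports can never exceed the maximum support, Definition 3.3 reduces to a simple test: p̄ is consistent in K iff, at every object, all of the weight sits on labels of maximal support. A sketch of the check (illustrative; support computes the array si(λ) of Definition 3.1):

    import numpy as np

    # Consistency test for a weighted assignment p of shape (n, m) (sketch).
    def is_consistent(p, support, tol=1e-9):
        s = support(p)                    # s[i, lam] = s_i(lam)
        avg = (p * s).sum(axis=1)         # sum_lam p_i(lam) s_i(lam), per object
        return bool(np.all(avg >= s.max(axis=1) - tol))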
III. Consistency
• Proposition 3.5: An unambiguous labeling which is consistent in K* is also consistent in K.
IV. Overview of Results
• Algorithm for converting a given labeling into a consistent
one:
– Two approaches:
• Optimization theory
• Finite variational calculus
– Lead to the same algorithm
• Achieving consistency is equivalent to solving a variational inequality (Theorem 4.1): find p̄ ∈ K such that
Σi Σλ si(λ; p̄) (vi(λ) − pi(λ)) ≤ 0 for all v̄ ∈ K
IV. Overview of Results
• Two paths to study consistency and derive algorithms for achieving it.
V. Average Local Consistency
• Goal: Update a nearly consistent labeling to a consistent one
• si(λ) should be large for the labels p̄ favors => Σλ pi(λ) si(λ) should be large at each object => the average local consistency
A(p̄) = Σi Σλ pi(λ) si(λ)
should be large.
– Two problems:
• Maximizing a sum does not necessarily maximize each individual term
• The individual components si(λ) depend on p̄, which varies during the maximization process
V. Average Local Consistency
• Maximizing A(p̄) is the same as maximizing the sum of the n per-object supports Σλ pi(λ) si(λ), which is not the same as maximizing those n quantities individually.
V. Average Local Consistency
• Special case: when the compatibility matrix is symmetric, maximizing A(p̄) leads to consistent labeling assignments.
• General case: the compatibility matrix is not symmetric; Section VIII develops the algorithm.
– Locally maximizing A(p̄) gives the same result as if the matrix were replaced by its symmetrized version.
V. Average Local Consistency
• Gradient ascent: finds local maxima of a smooth functional A(p̄) by successively moving the current p̄ a small step to a new p̄.
• The amount of increase in A(p̄) is related to the directional derivative of A in the direction of the step.
• The gradient: grad A = (∂A/∂pi(λ))
V. Average Local Consistency
• When the compatibilities are symmetric:
∂A/∂pi(λ) = 2 Σj Σλ’ rij(λ, λ’) pj(λ’) = 2 qi(λ) (cmp Dfn 3.1)
• q̄: the intermediate updating “direction”
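
This identity is easy to verify numerically. The snippet below (illustrative, not from the paper) builds a random symmetric compatibility array, computes A(p̄) and q̄, and checks a central-difference derivative against 2qi(λ):

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 3, 4
    r = rng.normal(size=(n, m, n, m))
    r = r + r.transpose(2, 3, 0, 1)              # enforce r_ij(a,b) = r_ji(b,a)

    def A(p):                                    # average local consistency
        return (p * np.einsum('iajb,jb->ia', r, p)).sum()

    p = rng.random((n, m))
    q = np.einsum('iajb,jb->ia', r, p)
    eps = 1e-6
    for i, a in [(0, 1), (2, 3)]:
        d = np.zeros((n, m)); d[i, a] = eps
        fd = (A(p + d) - A(p - d)) / (2 * eps)   # central difference
        assert abs(fd - 2 * q[i, a]) < 1e-5      # grad A = 2q under symmetry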
VI. Geometric Structure of Assignment Space
• Goal: To discuss gradient ascent on K, and to visualize the
more general updating algorithms.
• A simple example: n = 2 objects, each with m = 3 possible labels; the weights at each object live on a 2-simplex (a triangle).
VI. Geometric Structure of Assignment Space
• The vector p̄ consists of two points, each lying in a copy of the space shown in Fig. 2.
• K: the set of all pairs of points in two copies of the triangular space in Fig. 2.
• K with n objects, each with m labels:
– Space: n copies of an (m−1)-simplex
– K: the set of all n-tuples of points, each point lying in a copy of the (m−1)-dimensional surface
– A weighted labeling assignment is a point in the assignment space
K.
– An unambiguous labeling: one of the “corners”
– Each simplex has m corners
VI. Geometric Structure of Assignment Space
• Tangent space: the set of all directions in which one can move from a given point while remaining in K; placed at the given point, the tangent set lies alongside the surface K.
– At an interior point of the surface, the tangent set is an entire vector space.
– At a boundary point, it is only a convex subset of a vector space.
VI. Geometric Structure of Assignment Space
• p̄: a labeling assignment in K
• v̄: any other assignment in K
• The difference vector (a direction): ū = v̄ − p̄
VI. Geometric Structure of Assignment Space
• The set of all tangent vectors at p̄ (directions pointing along or into K):
Tp̄ = { ū = α(v̄ − p̄) : v̄ ∈ K, α ≥ 0 }
• At an interior point p̄, as v̄ roams over K this set fills an entire subspace:
T = { ū : Σλ ui(λ) = 0, i = 1, …, n }
VI. Geometric Structure of Assignment Space
• When p̄ lies on a boundary of K, the tangent set is a proper subset of the above space:
Tp̄ = { ū : Σλ ui(λ) = 0, and ui(λ) ≥ 0 whenever pi(λ) = 0 }
VII. Maximizing Average Local Consistency
• To find a consistent labeling:
– Constraints symmetric: gradient ascent
– Constraints not symmetric: the same algorithm, derived differently (Section VIII)
– The increase in A(p̄) due to a small step of length α in the direction ū is approximately the directional derivative:
A(p̄ + αū) − A(p̄) ≈ α (grad A · ū), ||ū|| = 1
(the greatest increase in A(p̄) can be expected if the step is taken in the tangent direction ū maximizing grad A · ū)
VII. Maximizing Average Local Consistency
• To find the direction of steepest ascent (Problem 7.1): maximize grad A · ū over tangent directions ū at p̄ with ||ū|| ≤ 1 (a solution always exists)
VII. Maximizing Average Local Consistency
• Lemma 7.3: If p̄ lies in the interior of K, then the following algorithm solves Problem 7.1: set
ui(λ) = qi(λ) − (1/m) Σλ’ qi(λ’)
and normalize so that ||ū|| = 1 (or ū = 0).
– This may fail when p̄ is a boundary point of K (handled by the algorithm in Appendix A)
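
At an interior point the tangent space is just {ū : Σλ ui(λ) = 0 for each i}, so the projection amounts to subtracting each object’s mean support. A sketch of that interior-point case (illustrative):

    import numpy as np

    # Interior-point projection of Lemma 7.3 (sketch): subtracting the
    # per-object mean of q projects onto {u : sum_lam u_i(lam) = 0};
    # the result is then normalized.
    def project_interior(q):
        u = q - q.mean(axis=1, keepdims=True)
        norm = np.linalg.norm(u)
        return u / norm if norm > 0 else u       # ||u|| = 1, or u = 0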
Appendix A. Updating Direction Algorithm
• Gives an algorithm to replace the updating formulas in common use in relaxation labeling processes.
• Gives a projection operator (a finite iterative algorithm) based on the consistency theory and permitting proofs of convergence results.
• Solution to the projection problem: the returned vector ū.
• Normalization: ||ū|| = 1 (or ū = 0)
• Step length: αi
VII. Maximizing Average Local Consistency
• Algorithm 7.4: finds consistent labelings when the matrix of compatibilities is symmetric
– Successive iterates are obtained by moving a small step in the direction of the projection of the gradient
– The algorithm stops when the projection ū = 0
VII. Maximizing Average Local Consistency
• Proposition 7.5: Suppose p̄ is a stopping point of Algorithm 7.4. If the matrix of compatibilities is symmetric, then p̄ is consistent.
VIII. The Relaxation Labeling Algorithm
• The entire preceding analysis of average local consistency relies on the assumption of symmetric compatibilities.
• Example of asymmetry: constraints between letters in English (e.g., “q” strongly predicts a following “u”, while “u” only weakly constrains the preceding letter).
• Theorem 4.1 is general: the variational inequality makes no symmetry assumption.
VIII. The Relaxation Labeling Algorithm
• Define q̄ by qi(λ) = si(λ; p̄) = Σj Σλ’ rij(λ, λ’) pj(λ’).
• Observation 8.1: With q̄ defined as above, the variational inequality is equivalent to the statement: a labeling p̄ is consistent iff q̄ points away from all tangent directions at p̄ (ū · q̄ ≤ 0 for all ū ∈ Tp̄).
• Algorithm 8.2 (The Relaxation Labeling Algorithm): compute q̄(p̄), project it onto the tangent set at p̄ to obtain ū; stop if ū = 0, otherwise move a small step to p̄ + αū and repeat.
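
A minimal sketch of the loop (illustrative names): support computes q̄ per Definition 3.1; project maps q̄ to a tangent direction at p̄ (the Appendix A operator in general; at interior points one could pass lambda q, p: project_interior(q) from the sketch above); the fixed step size alpha is assumed small enough that p̄ remains in K.

    import numpy as np

    # Relaxation labeling loop in the spirit of Algorithm 8.2 (sketch).
    def relaxation_labeling(p, support, project, alpha=0.05, max_iter=1000, tol=1e-8):
        for _ in range(max_iter):
            q = support(p)                   # q_i(lam) = s_i(lam; p)
            u = project(q, p)                # tangent direction at p
            if np.linalg.norm(u) < tol:      # projection is 0: p is consistent
                break
            p = p + alpha * u                # small step in the tangent direction
        return p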
VIII. The Relaxation Labeling Algorithm
• Proposition 8.3: Suppose p̄ is a stopping point of Algorithm 8.2; then p̄ is consistent.
• Questions:
– Are there any consistent labelings for the relaxation labeling algorithm to find? (Answered by 8.4)
– Assuming that such points exist, will the algorithm find them? (Answered in IX)
– Even if a relaxation labeling process converges to a consistent labeling, is the final labeling better than the initial assignment? (Not well defined)
VIII. The Relaxation Labeling Algorithm
• Example: the asymmetric English-letter constraints
• Proposition 8.4: The variational inequality
of Theorem 4.1 always has at least one
solution. Thus consistent labelings always
exist, for arbitrary compatibility matrices.
• Usually, more than one solution will exist.
IX. A Local Convergence Result
• As the step size of relaxation labeling Algorithm 7.4 or 8.2 becomes infinitesimal, these discrete algorithms approximate a dynamical system:
dp̄/dt = ū(p̄), where ū(p̄) is the projection of q̄(p̄) onto the tangent set at p̄
• Hypothesis of Theorem 9.1: the initial labeling at every object is close to the consistent assignment
IX. A Local Convergence Result
• Assume that p̄ is strictly consistent in order to prove that it is a local attractor of the relaxation labeling dynamical system.
• If p̄ is consistent, but not strictly consistent, it may be:
– A local attractor of the dynamical system
– A saddle point
– An unstable stopping point
X. Generalizations to Higher Order Compatibilities
• Consistency can be defined using support functions that depend on compatibilities of arbitrary order:
– First-order compatibilities: si(λ) = ri(λ)
– Third-order: si(λ) = Σj,k Σλ’,λ’’ rijk(λ, λ’, λ’’) pj(λ’) pk(λ’’)
• Symmetry condition (second order): rij(λ, λ’) = rji(λ’, λ)
X. Generalizations to Higher Order Compatibilities
– k-th-order compatibilities: si(λ) = Σ over objects j2, …, jk and labels λ2, …, λk of r(λ, λ2, …, λk) pj2(λ2) ⋯ pjk(λk)
• Symmetry condition: the compatibilities are invariant under permutations of the (object, label) index pairs
X. Generalizations to Higher Order Compatibilities
• Compatibilities higher than second order, or non-polynomial compatibilities:
– Difficulty: combinatorial growth in the number of required computations
– Most implementations of relaxation labeling processes have limited the computations to second-order compatibilities
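
The growth is visible in the array shapes: a k-th-order support sums over k−1 (object, label) pairs, so the cost per support value grows like (nm)^(k−1). A third-order sketch (illustrative):

    import numpy as np

    # Third-order support (sketch): r3 has shape (n, m, n, m, n, m), and
    # s_i(lam) = sum_{j,k} sum_{mu,nu} r_ijk(lam, mu, nu) p_j(mu) p_k(nu).
    # Cost per entry is O((n*m)^2), versus O(n*m) at second order.
    def support3(p, r3):
        return np.einsum('iajbkc,jb,kc->ia', r3, p, p)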
XI. Comparisons with Standard Relaxation
Labeling Updating Schemes
• Algorithm 8.2 updates a weighted labeling assignment by computing q̄ and then moving in the direction defined by the projection of q̄ onto the tangent set at p̄.
• Two other standard formulas for relaxation labeling:
– Product form (for nonnegative supports): pi_new(λ) = pi(λ) qi(λ) / Σμ pi(μ) qi(μ)
XI. Comparisons with Standard Relaxation
Labeling Updating Schemes
– The nonlinear formula of the prototype algorithm: pi_new(λ) = pi(λ)[1 + qi(λ)] / Σμ pi(μ)[1 + qi(μ)]
• The denominator is a normalization term
• The numerator can be rewritten as pi(λ) + pi(λ) qi(λ): a step from p̄ in a direction with components pi(λ) qi(λ), rather than in the projected direction ū of Algorithm 8.2
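
For comparison, the two standard schemes can be written side by side (a sketch; the nonlinear form is the prototype formula of Section II, and the product form assumes positive supports):

    import numpy as np

    # Two standard updating schemes (sketch); q is the support array.
    def update_nonlinear(p, q):              # p_i(lam)[1 + q_i(lam)] / norm
        new = p * (1.0 + q)
        return new / new.sum(axis=1, keepdims=True)

    def update_product(p, q):                # p_i(lam) q_i(lam) / norm, q > 0
        new = p * q
        return new / new.sum(axis=1, keepdims=True)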
XII. Summary and Conclusions
• Relaxation labeling processes: Mechanisms for employing
context and constraints in labeling problems.
• Background: Lacking a proper model characterizing the process and its stopping points, the choice of the coefficient values and of the updating formula was subject only to empirical justification.
• Achievement: Develops the foundations of a theory that establishes consistency as an explanation of what relaxation labeling accomplishes, and leads to a relaxation algorithm whose updating formula uses a projection operator.
XII. Summary and Conclusions
• Discrete relaxation: a label is discarded if it is not
supported by the local context of assigned labels.
• Weighted label assignment: An unambiguous labeling is
consistent if the support for the instantiated label at each
object is greater than or equal to the support for all other
labels at that object.
• Relaxation labeling process defined by Algo 8.2 with the
projection operator specified in Appendix A stops at
consistent labelings.
• The dynamic process will converge to a consistent labeling if it begins sufficiently near a strictly consistent one.
XII. Summary and Conclusions
• Symmetry: when the compatibilities are symmetric, the relaxation labeling algorithm is equivalent to gradient ascent on the average local consistency function.
• Future work:
– Efficient implementations of the projection operator
– Choice of the step size
– Normalization methods
Thank you