Network Lasso: Clustering and
Optimization in Large Graphs
David Hallac, Jure Leskovec, Stephen Boyd
Stanford University
Presented by Yu Zhao
What is this paper about?

Lasso problem
The lasso solution is unique when rank(X) = p, because the criterion is
strictly convex.
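For reference, the lasso problem in its standard form ($X \in \mathbf{R}^{n \times p}$ is the design matrix, $y$ the response vector):

$$\underset{\beta \in \mathbf{R}^p}{\text{minimize}}\ \tfrac{1}{2}\,\|y - X\beta\|_2^2 + \lambda\,\|\beta\|_1$$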
What is this paper about?

Network lasso problem

$$\underset{x}{\text{minimize}}\ \sum_{i \in \mathcal{V}} f_i(x_i) + \sum_{(j,k) \in \mathcal{E}} g_{jk}(x_j, x_k)$$

where $x_i \in \mathbf{R}^p$ is the variable at node $i$, $f_i$ is the cost function at node $i$, and $g_{jk}$ is the cost function associated with edge $(j,k)$. (The total number of scalar variables is $mp$.)
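To make the definition concrete, a minimal sketch in Python that evaluates this objective for the network lasso case $g_{jk}(x_j, x_k) = \lambda w_{jk}\|x_j - x_k\|_2$, with an illustrative quadratic node cost (the inputs `a`, `edges`, and `weights` are made up for illustration):

```python
import numpy as np

def network_lasso_objective(x, a, edges, weights, lam):
    """sum_i f_i(x_i) + lam * sum_{(j,k)} w_jk * ||x_j - x_k||_2,
    with the illustrative choice f_i(x_i) = ||x_i - a_i||_2^2."""
    node_cost = np.sum((x - a) ** 2)                    # sum of the f_i
    edge_cost = sum(w * np.linalg.norm(x[j] - x[k])     # sum-of-norms penalty
                    for (j, k), w in zip(edges, weights))
    return node_cost + lam * edge_cost
```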
Outline
 Convex problem definition
 Proposed solution (ADMM)
 Non-convex extension
 Experiments
Convex problem definition
$$\underset{x}{\text{minimize}}\ \sum_{i \in \mathcal{V}} f_i(x_i) + \sum_{(j,k) \in \mathcal{E}} g_{jk}(x_j, x_k) \qquad (1)$$

$$\underset{x}{\text{minimize}}\ \sum_{i \in \mathcal{V}} f_i(x_i) + \lambda \sum_{(j,k) \in \mathcal{E}} w_{jk}\,\|x_j - x_k\|_2 \qquad (2)$$
Convex problem definition
A distributed and scalable method was developed for solving the network lasso problem, in which each vertex variable $x_i$ is controlled by one "agent", and the agents exchange (small) messages over the graph to solve the problem iteratively.
Convex problem definition

General settings for different applications

e.g. Control system:
 Nodes: possible states
 $x_i$: the action to take in state $i$
 Graph: state transitions
 Weights: how much we care about the actions in neighboring states differing
Convex problem definition


General settings for different applications
The sum-of-norms regularization that we use is like group lasso: it encourages not just $x_j \approx x_k$ for edge $(j,k)$, but $x_j = x_k$, i.e., consensus across the edge.
Convex problem definition

Regularization Path
 $\lambda = 0$: each $x_i$ is simply a minimizer of $f_i$ → local computations
 $\lambda \to \infty$ ($\lambda \ge \lambda_{\text{critical}}$): all $x_i$ are equal → global consensus
Convex problem definition

Network lasso and clustering
 The $\ell_2$-norm penalty (not squared) defines the network lasso; it drives many edges to exact consensus $x_j = x_k$, which partitions the nodes into clusters.
 Cluster size: controlled by $\lambda$ (larger $\lambda$ yields fewer, larger clusters)
Convex problem definition

Inference on New Nodes
 We can interpolate the solution to estimate the variable $x_j$ at a new node $j$ (a sketch of one way to do this follows below).
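Holding the solved neighbor variables $x_k$ fixed, this interpolation amounts to solving $\hat{x}_j = \arg\min_x \sum_k w_{jk}\|x - x_k\|_2$, a weighted geometric median. A minimal Weiszfeld-style sketch (the initialization, iteration cap, and tolerance are my choices, not the paper's):

```python
import numpy as np

def interpolate_new_node(neighbors_x, w, iters=100, tol=1e-9):
    """Weighted geometric median argmin_x sum_k w_k ||x - x_k||_2,
    with the neighbor solutions neighbors_x (shape (k, p)) held fixed."""
    x = np.average(neighbors_x, axis=0, weights=w)   # warm start at the mean
    for _ in range(iters):
        d = np.linalg.norm(neighbors_x - x, axis=1)
        if np.any(d < tol):            # landed on a neighbor: done
            break
        coef = w / d
        x_new = coef @ neighbors_x / coef.sum()
        if np.linalg.norm(x_new - x) < tol:
            x = x_new
            break
        x = x_new
    return x
```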
Proposed solution (ADMM)

Alternating Direction Method of Multipliers (ADMM)
S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the
alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3:1–122, 2011.
Proposed solution (ADMM)

ADMM in network lasso
 1).
Introduce a copy of 𝑥𝑖, called 𝑧𝑖𝑗 , at each edge 𝑖𝑗.
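Following the paper's construction, the edge copies turn problem (2) into a separable consensus form (my transcription of the standard rewriting):

$$\begin{aligned} \underset{x,\,z}{\text{minimize}}\quad & \sum_{i \in \mathcal{V}} f_i(x_i) + \lambda \sum_{(j,k) \in \mathcal{E}} w_{jk}\,\|z_{jk} - z_{kj}\|_2 \\ \text{subject to}\quad & x_j = z_{jk},\ \ x_k = z_{kj}, \qquad (j,k) \in \mathcal{E} \end{aligned}$$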
Proposed solution (ADMM)

ADMM in network lasso
 2).
Augmented Lagrangian
M. R. Hestenes. Multiplier and gradient methods. Journal of Optimization Theory and Applications,
4:302–320, 1969.
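The equation on this slide was an image. In scaled form, with dual variables $u$ and penalty parameter $\rho$, the augmented Lagrangian of the consensus-form problem above reads (my transcription; constant terms in $u$ dropped):

$$\mathcal{L}_\rho(x, z, u) = \sum_{i \in \mathcal{V}} f_i(x_i) + \sum_{(j,k) \in \mathcal{E}} \Big( \lambda w_{jk}\,\|z_{jk} - z_{kj}\|_2 + \frac{\rho}{2}\,\|x_j - z_{jk} + u_{jk}\|_2^2 + \frac{\rho}{2}\,\|x_k - z_{kj} + u_{kj}\|_2^2 \Big) + \text{const}$$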
Proposed solution (ADMM)

ADMM in network lasso
 3).
ADMM updates
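The update equations on this slide were images. A minimal runnable sketch, assuming the illustrative quadratic node cost $f_i(x_i) = \|x_i - a_i\|_2^2$ so the x-update has a closed form (any convex $f_i$ works, but then the x-update needs its own solver); the $\theta$ in the z-update follows the paper's closed-form solution:

```python
import numpy as np

def network_lasso_admm(a, edges, weights, lam, rho=1.0, iters=200):
    """ADMM for the network lasso with the illustrative node cost
    f_i(x_i) = ||x_i - a_i||_2^2. a has shape (m, p); edges is a list
    of (i, j) pairs; weights aligns with edges."""
    m, p = a.shape
    x = a.copy()
    z, u = {}, {}                      # edge copies and scaled duals
    neighbors = {i: [] for i in range(m)}
    for i, j in edges:
        z[(i, j)], z[(j, i)] = x[i].copy(), x[j].copy()
        u[(i, j)], u[(j, i)] = np.zeros(p), np.zeros(p)
        neighbors[i].append(j)
        neighbors[j].append(i)
    for _ in range(iters):
        # x-update: closed form for the quadratic f_i
        for i in range(m):
            nb = neighbors[i]
            s = sum(z[(i, j)] - u[(i, j)] for j in nb)
            x[i] = (2 * a[i] + rho * s) / (2 + rho * len(nb))
        # z-update: closed form (block soft threshold across each edge)
        for (i, j), w in zip(edges, weights):
            ai, aj = x[i] + u[(i, j)], x[j] + u[(j, i)]
            d = np.linalg.norm(ai - aj)
            theta = max(1 - lam * w / (rho * d), 0.5) if d > 0 else 0.5
            z[(i, j)] = theta * ai + (1 - theta) * aj
            z[(j, i)] = (1 - theta) * ai + theta * aj
        # u-update: scaled dual ascent on the consensus constraints
        for i, j in edges:
            u[(i, j)] += x[i] - z[(i, j)]
            u[(j, i)] += x[j] - z[(j, i)]
    return x
```

Note that $\theta = 0.5$ means the edge has reached consensus ($z_{ij} = z_{ji}$), which is how the clustering behavior shows up in the iterates.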
Proposed solution (ADMM)

Regularization Path
 Compute the regularization path as a function of $\lambda$ to gain insight into the network structure
 Start at $\lambda = \lambda_{\text{initial}}$
 Update: $\lambda := \alpha\lambda$, $\alpha > 1$
 Stop: once the entire network reaches consensus ($\lambda \ge \lambda_{\text{critical}}$); a code sketch of this sweep follows below
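In code, the sweep might look like this sketch (it reuses `network_lasso_admm` from the previous slide; `lam_init`, `alpha`, and the numerical consensus test are illustrative choices, and warm starting across $\lambda$ values is omitted for brevity):

```python
import numpy as np

def regularization_path(a, edges, weights, lam_init=0.01, alpha=2.0, tol=1e-4):
    """Sweep lambda geometrically, recording the solution at each value;
    stop once every edge is (numerically) in consensus."""
    lam, path = lam_init, []
    while True:
        x = network_lasso_admm(a, edges, weights, lam)
        path.append((lam, x))
        if max(np.linalg.norm(x[j] - x[k]) for j, k in edges) < tol:
            break   # whole network in consensus: lam has passed lam_critical
        lam *= alpha
    return path
```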
Non-convex extension


 Replace the group lasso penalty with a monotonically nondecreasing concave function $\phi(u)$, with $\phi(0) = 0$ and domain $u \ge 0$.
 ADMM is not guaranteed to converge, and even if it does, it need not be to a global optimum.
Non-convex extension
Heuristic solution:
 Keep track of the iteration that yields the minimum objective, and return that as the solution instead of the most recent iterate (sketched below).
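A minimal sketch of this safeguard (`step` and `objective` are caller-supplied callables, not anything from the paper):

```python
def best_iterate(step, objective, x0, iters=500):
    """Run an iterative method, but return the iterate with the lowest
    objective seen rather than the last one; useful when convergence to
    a global optimum is not guaranteed, as in the non-convex case."""
    x, best_x, best_obj = x0, x0, objective(x0)
    for _ in range(iters):
        x = step(x)
        obj = objective(x)
        if obj < best_obj:
            best_obj, best_x = obj, x
    return best_x
```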

Non-convex extension

Non-convex z-Update
 Compared to the convex case, the only difference in the ADMM solution is the z-update, which now uses the penalty $\phi(\|z_{jk} - z_{kj}\|_2)$ in place of $\lambda w_{jk}\,\|z_{jk} - z_{kj}\|_2$.
Experiments

1. Network-Enhanced Classification
 We first analyze a synthetic network in which each node has a support vector machine (SVM) classifier, but does not have enough training data to estimate it accurately.

Idea:
 "Borrow" training examples from relevant neighbors to improve each node's own results
 Neighbors with different underlying models incur nonzero lasso penalties
Experiments


1. Network-Enhanced Classification
Dataset:
 Randomly generate a dataset containing 1000 nodes, each with its own classifier, a support vector machine in $\mathbf{R}^{50}$. Each node tries to predict $y \in \{-1, 1\}$.

Network:
 The 1000 nodes are split into 20 equally-sized groups. Each group has a common underlying classifier, while different groups have independent models.
Experiments


1. Network-Enhanced Classification
Objective function:
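The objective on this slide was an image that did not survive extraction. A plausible reconstruction, assuming a standard soft-margin hinge loss at each node; the per-node training pairs $(a_j, y_j)$ and the constant $c$ are my notation, not necessarily the paper's:

$$f_i(x_i) = \sum_{j \in \text{train}(i)} \big(1 - y_j\, a_j^{T} x_i\big)_+ + c\,\|x_i\|_2^2, \qquad \underset{x}{\text{minimize}}\ \sum_{i \in \mathcal{V}} f_i(x_i) + \lambda \sum_{(j,k) \in \mathcal{E}} w_{jk}\,\|x_j - x_k\|_2$$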
Experiments


1. Network-Enhanced Classification
Results(regularization path):
Experiments


1. Network-Enhanced Classification
Results(prediction accuracy):
Experiments


1. Network-Enhanced Classification
Results(timing):
Convergence comparison between centralized and ADMM methods for the SVM problem
Experiments


1. Network-Enhanced Classification
Results(timing):
Convergence time for a large-scale 3-regular graph solved at a single (constant) value of $\lambda$
Experiments


2. Spatial Clustering and Regressors
Attempt to estimate the price of homes based on
latitude/longitude data and a set of features.
Experiments


2. Spatial Clustering and Regressors
Dataset:


A list of real estate transactions over a one-week period in May 2008 in the Greater Sacramento area.
Network:
 Build the graph using the latitude/longitude coordinates of each house
 Connect every remaining house to the five nearest homes, with an edge weight inversely proportional to the distance between the houses
 The resulting graph has 785 nodes, 2447 edges, and a diameter of 61
Experiments


2. Spatial Clustering and Regressors
Optimization Parameter and Objective Function:
 At each node, solve for the local regression parameters $x_i$
 Objective function (a plausible form is sketched below):
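The equations on this slide were images. A plausible sketch of the setup, assuming each node $i$ fits local regression weights $x_i$ to predict its own sale price $p_i$ from its feature vector $a_i$, with a small ridge term $\mu$ (my notation, not necessarily the paper's):

$$f_i(x_i) = \big(a_i^{T} x_i - p_i\big)^2 + \mu\,\|x_i\|_2^2$$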
Experiments


2. Spatial Clustering and Regressors
Results:
Experiments


2. Spatial Clustering and Regressors
Results:
Conclusion
 The network lasso is a useful way of representing convex optimization problems, and the magnitude of the improvements in the experiments shows that this approach is worth exploring further, as there are many potential ideas to build on.
 The non-convex method gave performance comparable to the convex approach; the analysis of different non-convex functions $\phi(u)$ is left for future work.
 We could attempt to iteratively reweight the edge weights to attain some desired outcome.
 Within the ADMM algorithm, there are many ways to improve speed, performance, and robustness.

Questions?