Introduction
Problem Statement
Reformulating the Problem
Karush-Kuhn-Tucker
Implementation
Application of Numerical Algebraic Geometry to Geometric Data Analysis
Daniel Bates (1), Brent Davis (1), Chris Peterson (1), Michael Kirby (1), Justin Marks (2)
(1) Colorado State University - Fort Collins, CO
(2) Air Force Institute of Technology - Wright Patterson Air Force Base, OH
August 2, 2013
B. Davis
Numerical Algebraic Geometry and Data Analysis
SIAM AG13
Definitions
Definition: Grassmann manifold
Let the Grassmann manifold Gr(n, p) denote the set of all p-dimensional subspaces of $\mathbb{R}^n$.
Definition: Elements in Gr(n, p)
A point [M] ∈ Gr(n, p) is identified with an equivalence class of full-rank n × p orthonormal matrices that have the same column space as M.
Definition: Principal Angles
Two subspaces [X] and [Y] of $\mathbb{R}^n$ have p principal angles $\theta_1 \le \theta_2 \le \cdots \le \theta_p \le \pi/2$, where p = min{dim [X], dim [Y]}.
The principal angles between [X] and [Y] are the inverse cosines of the singular values of the matrix $X^T Y$.
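The definition above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the talk: the function name `principal_angles` is hypothetical, and the QR step is an added assumption so that inputs need not already have orthonormal columns.

```python
import numpy as np

def principal_angles(X, Y):
    """Principal angles (ascending, in radians) between span(X) and span(Y)."""
    # Orthonormalize the columns so the singular values below are cosines.
    Qx, _ = np.linalg.qr(np.asarray(X, dtype=float))
    Qy, _ = np.linalg.qr(np.asarray(Y, dtype=float))
    # Singular values come back in descending order,
    # so the resulting angles are in ascending order.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    # Clipping guards against round-off pushing a cosine slightly above 1.
    return np.arccos(np.clip(s, 0.0, 1.0))

# The x-axis and y-axis in Gr(2, 1).
e1 = np.array([[1.0], [0.0]])
e2 = np.array([[0.0], [1.0]])
angles_axes = principal_angles(e1, e2)
```

For the two coordinate axes the single principal angle is a right angle, matching the example on the next slide.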
Examples
Example: Angles between lines
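Consider the x-axis and y-axis in Gr(2, 1), represented by the unit vectors $e_1 = (1, 0)^T$ and $e_2 = (0, 1)^T$, respectively.
$e_1^T e_2 = (1, 0)(0, 1)^T = 0.$
The SVD of the 1 × 1 matrix (0) is $1 \cdot 0 \cdot 1^T$, so its only singular value is 0.
$\theta_1 = \cos^{-1}(0) = \pi/2$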
Examples
Example: Angle between hyperplanes in R3
Consider two matrices
$$X \approx \begin{bmatrix} 0.218217 & 0.975900 \\ -0.872871 & 0.195180 \\ 0.436435 & -0.097590 \end{bmatrix}, \qquad Y \approx \begin{bmatrix} -0.666666 & 0.741890 \\ 0.333333 & 0.382911 \\ 0.666666 & 0.550434 \end{bmatrix}$$
$\theta_1 \approx 0$
$\theta_2 \approx 1.506535$
In fact, for any two distinct [X], [Y] ∈ Gr(3, 2):
$\theta_1 = 0$, since the two subspaces intersect in a line.
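The two angles quoted in this example can be checked numerically from the printed matrices. The snippet below is a small sketch that assumes only NumPy and the six-digit entries shown on the slide, so agreement is limited to roughly that precision.

```python
import numpy as np

# Orthonormal bases of the two planes, copied from the slide.
X = np.array([[ 0.218217,  0.975900],
              [-0.872871,  0.195180],
              [ 0.436435, -0.097590]])
Y = np.array([[-0.666666,  0.741890],
              [ 0.333333,  0.382911],
              [ 0.666666,  0.550434]])

# Cosines of the principal angles are the singular values of X^T Y.
cosines = np.linalg.svd(X.T @ Y, compute_uv=False)
# Clip because rounding in the data can push the largest cosine above 1.
theta = np.arccos(np.clip(cosines, 0.0, 1.0))
```

Here `theta[0]` is zero up to the rounding of the data, reflecting the shared line, and `theta[1]` is the angle between the planes.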
Big Picture and Problems
Fundamental Problem in Geometric Data Analysis
Given a cluster of points $\{[Y_1], \ldots, [Y_k]\}$ in one or more Grassmann manifolds, how do we assign a mean representative to the cluster using principal angles?
Big Picture and Problems
Problems
1. The cluster of points might be in Gr(n, 1) ⊕ Gr(n, 2) ⊕ · · · ⊕ Gr(n, n − 1), a disjoint union of Grassmann manifolds of various dimensions.
2. Geometric assumptions need to be made so that local gradient-based methods work properly.
3. There could be more than one mean representative, and even an infinite number of them.
Section 2
Problem Statement
Problem Statement
We address these issues using a mean representative based on the cosines of the principal angles.
Problem Statement
$\{V_1, \ldots, V_k\}$: subspaces of $\mathbb{R}^n$.
$Y_i$: fixed $n \times d_i$ orthonormal matrices such that $V_i = [Y_i]$.
L: a one-dimensional subspace of $\mathbb{R}^n$.
$\theta(L, V_i)$: the principal angle between L and $V_i$.
Find L such that it maximizes the function:
$$F(L) = \sum_{i=1}^{k} \cos\theta(L, V_i)$$
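To make the objective concrete, here is a minimal sketch of evaluating F for a given line. It relies on the fact that, for a unit vector spanning L and an orthonormal basis matrix of a subspace, the cosine of the principal angle is the norm of the basis matrix transposed times that vector (the single singular value of a one-row matrix). The function name `objective` and the test subspaces are illustrative assumptions, not part of the talk.

```python
import numpy as np

def objective(ell, Ys):
    """F(L) = sum_i cos(theta(L, V_i)) for L = span(ell), V_i = column space of Ys[i]."""
    ell = np.asarray(ell, dtype=float)
    ell = ell / np.linalg.norm(ell)      # F depends only on the line, not the scale
    # For unit ell and orthonormal Y_i, cos(theta(L, V_i)) = ||Y_i^T ell||.
    return sum(np.linalg.norm(Y.T @ ell) for Y in Ys)

# Illustrative data: the three coordinate planes of R^3.
I3 = np.eye(3)
planes = [I3[:, [0, 1]], I3[:, [0, 2]], I3[:, [1, 2]]]
F_diag = objective([1.0, 1.0, 1.0], planes)   # line through (1, 1, 1)
F_axis = objective([1.0, 0.0, 0.0], planes)   # the x-axis
```

The diagonal line scores higher than a coordinate axis, which lies in two of the planes but is orthogonal to the third.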
Geometric Examples
Example 1
Consider the three standard coordinate planes in $\mathbb{R}^3$.
The special structure of the coordinate axes produces four distinct lines $L_1, \ldots, L_4$ that all attain the same optimal value.
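One way to see where the four lines come from, offered as a sketch rather than the argument given in the talk: let $\ell = (\ell_1, \ell_2, \ell_3)$ be a unit vector spanning L. Projecting onto a coordinate plane drops one coordinate, so each cosine is $\sqrt{1 - \ell_j^2}$, and concavity of the square root (Jensen's inequality) bounds the sum.

```latex
F(\ell) \;=\; \sum_{j=1}^{3} \sqrt{1 - \ell_j^2}
\;\le\; 3\sqrt{\frac{1}{3}\sum_{j=1}^{3}\bigl(1 - \ell_j^2\bigr)}
\;=\; 3\sqrt{\tfrac{2}{3}} \;=\; \sqrt{6},
\qquad \text{with equality iff } \ell_1^2 = \ell_2^2 = \ell_3^2 = \tfrac{1}{3}.
```

The maximizers are the eight unit vectors $(\pm 1, \pm 1, \pm 1)/\sqrt{3}$, which pair up under $\ell \mapsto -\ell$ into four distinct lines.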
Geometric Examples
Example 2
Consider three randomly chosen planes in $\mathbb{R}^3$.
The generic behavior of three hyperplanes will produce one distinct line L that gives the optimal solution.
Geometric Examples
Example 3
Consider the xy-plane and the z-axis in $\mathbb{R}^3$.
The line L is not unique: there is an entire cone of optimal lines L.
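A short calculation, given here as a sketch under the assumption that $\ell$ is a unit vector spanning L and $\varphi \in [0, \pi/2]$ is the angle between L and the z-axis, shows why the optimum is a cone:

```latex
F(\ell) \;=\; \underbrace{\sqrt{\ell_1^2 + \ell_2^2}}_{\cos\theta(L,\ xy\text{-plane})}
\;+\; \underbrace{|\ell_3|}_{\cos\theta(L,\ z\text{-axis})}
\;=\; \sin\varphi + \cos\varphi
\;=\; \sqrt{2}\,\sin\!\bigl(\varphi + \tfrac{\pi}{4}\bigr)
\;\le\; \sqrt{2}.
```

Equality holds exactly when $\varphi = \pi/4$, so every line making a 45-degree angle with the z-axis is optimal, and these lines sweep out a cone around the z-axis.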
Section 3
Reformulating the Problem
Step 1
Main Idea
The problem can be reformulated as a completely algebraic optimization problem.
We break the reformulation into two main parts.
Step 1
L = span of some unit-length vector $\ell$.
$\cos\theta(L, V_i)$ is the singular value of $\ell^T Y_i$.
$\|\ell^T Y_i\|$ is the length of the projection of $\ell$ onto $V_i$.
$\mathrm{proj}_{V_i}\,\ell$ is the vector in $V_i$ that makes the smallest angle with $\ell$.
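The link between the projection and the singular value can be spelled out in one line. The derivation below is a sketch that assumes $Y_i$ has orthonormal columns, so that $Y_i Y_i^T$ is the orthogonal projector onto $V_i$:

```latex
\bigl\|\mathrm{proj}_{V_i}\,\ell\bigr\|^2
= \bigl\|Y_i Y_i^T \ell\bigr\|^2
= \ell^T Y_i \,\bigl(Y_i^T Y_i\bigr)\, Y_i^T \ell
= \ell^T Y_i Y_i^T \ell
= \bigl\|Y_i^T \ell\bigr\|^2 .
```

Since $\ell^T Y_i$ is a matrix with a single row, its only singular value is its Euclidean norm, so $\cos\theta(L, V_i) = \|Y_i^T \ell\| = \|\mathrm{proj}_{V_i}\,\ell\|$.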
Step 1
Step 1 (continued)
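Therefore,
$$\max_{L} \ \sum_{i=1}^{k} \cos\theta(L, V_i) \quad \text{subject to } L \subset \mathbb{R}^n \text{ being a one-dimensional subspace}$$
is equivalent to
$$\max_{\ell} \ \sum_{i=1}^{k} \bigl\|\mathrm{proj}_{V_i}\,\ell\bigr\| \quad \text{subject to } \ell^T \ell = 1.$$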
Step 1
Step 1 (continued)
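Since $\|\mathrm{proj}_{V_i}\,\ell\| = \cos\theta(L, V_i) = \ell^T v_i$ for some unit-length vector $v_i \in V_i$, the problem
$$\max_{\ell} \ \sum_{i=1}^{k} \bigl\|\mathrm{proj}_{V_i}\,\ell\bigr\| \quad \text{subject to } \ell^T \ell = 1$$
is equivalent to finding an $\ell$ that solves the optimization problem
$$\max_{\ell,\, v_i} \ \sum_{i=1}^{k} \ell^T v_i \quad \text{subject to } \ell^T \ell = 1,\ v_i^T v_i = 1 \text{ and } v_i \in V_i.$$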
Step 2
We need to introduce numerical data for $V_i$, which is given in the form of orthonormal matrices $Y_i$ such that $V_i = [Y_i]$.
Step 2: Introduce data $Y_i$
$v_i \in [Y_i] \Rightarrow v_i = Y_i \alpha_i$ for some coefficient vector $\alpha_i$.
$v_i^T v_i = \alpha_i^T Y_i^T Y_i \alpha_i = \alpha_i^T \alpha_i = 1$, since $Y_i$ is orthonormal.
After factoring out $\ell^T$, we now have:
$$\max_{\ell,\, \alpha_i} \ \ell^T \sum_{i=1}^{k} Y_i \alpha_i \quad \text{subject to } \ell^T \ell = 1,\ \alpha_i^T \alpha_i = 1 \text{ for } 1 \le i \le k.$$
Step 2
Key Fact: The unit-length vector $\ell$ can be chosen independently of the choice of the $\alpha_i$.
To maximize: $\ell$ should point in the direction of $\sum_{i=1}^{k} Y_i \alpha_i$.
Reformulation 2
Find the $\alpha_i$'s that optimize
$$\max_{\alpha_i} \ \Bigl\|\sum_{i=1}^{k} Y_i \alpha_i\Bigr\|^2 \quad \text{subject to } \alpha_i^T \alpha_i = 1,$$
then set $v = \sum_{i=1}^{k} Y_i \alpha_i$, recover $\ell = v / \|v\|$, and produce L.
We call L the max-length-vector-line of best fit to a collection of subspaces $\{V_1, \ldots, V_k\}$.
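The two observations above (the best direction for a fixed set of coefficient vectors, and the best unit coefficient vector in each subspace for a fixed direction) suggest a simple alternating iteration. The sketch below is an illustrative local heuristic only: it is not the homotopy-continuation method used in the talk, it can stall at a local maximum, and it assumes the current direction is never orthogonal to one of the subspaces. The name `alternating_max` is hypothetical.

```python
import numpy as np

def alternating_max(Ys, ell0, iters=200):
    """Local ascent for max ||sum_i Y_i alpha_i|| over unit-length alpha_i."""
    ell = np.asarray(ell0, dtype=float)
    ell = ell / np.linalg.norm(ell)
    for _ in range(iters):
        # For fixed ell, the best alpha_i is Y_i^T ell rescaled to unit length
        # (assumes Y_i^T ell is nonzero).
        alphas = [Y.T @ ell / np.linalg.norm(Y.T @ ell) for Y in Ys]
        # For fixed alpha_i, the best ell points along v = sum_i Y_i alpha_i.
        v = sum(Y @ a for Y, a in zip(Ys, alphas))
        ell = v / np.linalg.norm(v)
    return ell, alphas, np.linalg.norm(v)

# Illustrative data: the three coordinate planes of R^3.
I3 = np.eye(3)
planes = [I3[:, [0, 1]], I3[:, [0, 2]], I3[:, [1, 2]]]
ell, alphas, length = alternating_max(planes, [0.5, 0.3, 0.8])
```

From this starting vector the iteration settles on the diagonal line through (1, 1, 1). A local method like this finds one candidate at a time, which is the motivation for computing all critical points instead.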
Geometry of max-length-vector-line of best fit
Geometry behind max-length-vector-line of best fit
Red vectors rotate in their subspaces
Black vector represents the max-length-vector-line of best fit
Maximize the length of the black vector(s).
Section 4
Karush-Kuhn-Tucker
KKT conditions
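The constraints $\alpha_i^T \alpha_i = 1$ define a compact set.
$\bigl\|\sum_{i=1}^{k} Y_i \alpha_i\bigr\|^2$ is continuous ⇒ the maximum is attained.
Local solutions are found at Karush-Kuhn-Tucker (KKT) points.
Set $\alpha^T = (\alpha_1^T, \ldots, \alpha_k^T)$.
The KKT conditions are
$$\nabla_\alpha \Bigl\|\sum_{i=1}^{k} Y_i \alpha_i\Bigr\|^2 + \sum_{i=1}^{k} \lambda_i \nabla_\alpha \bigl(\alpha_i^T \alpha_i - 1\bigr) = 0, \qquad \alpha_i^T \alpha_i = 1.$$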
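Writing the gradients out block by block makes the structure of these conditions visible. The following is a sketch of that computation, with $v = \sum_{j} Y_j \alpha_j$ introduced as shorthand:

```latex
\nabla_{\alpha_i} \Bigl\|\sum_{j=1}^{k} Y_j \alpha_j\Bigr\|^2 = 2\, Y_i^T v,
\qquad
\nabla_{\alpha_i} \bigl(\alpha_i^T \alpha_i - 1\bigr) = 2\,\alpha_i,
\qquad\Longrightarrow\qquad
Y_i^T \sum_{j=1}^{k} Y_j \alpha_j + \lambda_i\, \alpha_i = 0 \quad (1 \le i \le k).
```

Multiplying the $i$-th block on the left by $\alpha_i^T$ and using $\alpha_i^T \alpha_i = 1$ gives $\lambda_i = -\alpha_i^T Y_i^T v$, so each multiplier is determined by the $\alpha_i$.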
KKT equations
KKT equations
The KKT polynomial equations can be written compactly as:
$$\Bigl([Y_1 \cdots Y_k] \otimes [Y_1 \cdots Y_k] + \mathrm{diag}\bigl(\lambda_1^{d_1}, \ldots, \lambda_k^{d_k}\bigr)\Bigr) \cdot \alpha = 0, \qquad \alpha_i^T \alpha_i = 1$$
where
"⊗" denotes the block outer product of the matrices $Y_i$.
$\mathrm{diag}(\lambda_1^{d_1}, \ldots, \lambda_k^{d_k})$ is a diagonal matrix with diagonal elements $\lambda_1, \ldots, \lambda_k$, where each $\lambda_i$ is repeated $d_i$ times.
Geometrically, we are 'almost' solving a symmetric eigenvalue problem on $S^{d_1} \oplus \cdots \oplus S^{d_k}$.
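As a sanity check on the compact form, the sketch below assembles the system numerically and evaluates its residual at a candidate point. It assumes that the block outer product is the matrix whose (i, j) block is the transpose of the i-th basis matrix times the j-th, which is what the block-wise gradient of the squared length gives. That reading is an interpretation, and the function name `kkt_residual` is hypothetical.

```python
import numpy as np

def kkt_residual(Ys, alphas):
    """Residual of (G + D) alpha = 0, with multipliers recovered from alpha."""
    Y = np.hstack(Ys)                    # n x (d_1 + ... + d_k)
    G = Y.T @ Y                          # (i, j) block is Y_i^T Y_j (assumed meaning of the block product)
    alpha = np.concatenate(alphas)
    v = Y @ alpha                        # v = sum_i Y_i alpha_i
    # Multiplying block i of the KKT equation by alpha_i^T gives lambda_i.
    lams = [-(a @ (Yi.T @ v)) for Yi, a in zip(Ys, alphas)]
    # D repeats lambda_i once for each column of Y_i.
    D = np.diag(np.concatenate([np.full(len(a), lam)
                                for a, lam in zip(alphas, lams)]))
    return (G + D) @ alpha, np.array(lams)

# Illustrative check: the three coordinate planes of R^3 at the diagonal optimum.
I3 = np.eye(3)
planes = [I3[:, [0, 1]], I3[:, [0, 2]], I3[:, [1, 2]]]
alphas = [np.array([1.0, 1.0]) / np.sqrt(2.0)] * 3
residual, lams = kkt_residual(planes, alphas)
```

At this point the residual vanishes and all three multipliers coincide, which is consistent with the symmetry of the coordinate planes.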
Section 5
Implementation
Main Algorithm
Main Algorithm
To solve
$$\max_{L} \ \sum_{i=1}^{k} \cos\theta(L, V_i)$$
1. Find all solutions to
$$\Bigl([Y_1 \cdots Y_k] \otimes [Y_1 \cdots Y_k] + \mathrm{diag}\bigl(\lambda_1^{d_1}, \ldots, \lambda_k^{d_k}\bigr)\Bigr) \cdot \alpha = 0, \qquad \alpha_i^T \alpha_i = 1.$$
2. Using the $\alpha_i$'s, compute $\|v\| = \bigl\|\sum_{i=1}^{k} Y_i \alpha_i\bigr\|$.
3. Find the largest $\|v\|$ and set $\ell = v / \|v\|$ to recover L.
Main Algorithm
Implementing Main Algorithm
1. Use Bertini as a black box to approximate all solutions and find the real solutions.
2. Standard MATLAB routines.
3. Standard MATLAB routines.
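Steps 2 and 3 amount to a short post-processing loop over the real solutions. The talk used MATLAB for this; the sketch below shows the same idea in Python with NumPy. The input format (a list of coefficient-vector lists, one per real solution) is a hypothetical stand-in and not Bertini's actual output format, and the name `best_line` is likewise illustrative.

```python
import numpy as np

def best_line(Ys, candidates):
    """Pick the candidate whose v = sum_i Y_i alpha_i is longest; return (ell, ||v||)."""
    best_v = None
    for alphas in candidates:            # one list of alpha_i per real solution
        v = sum(Y @ a for Y, a in zip(Ys, alphas))
        if best_v is None or np.linalg.norm(v) > np.linalg.norm(best_v):
            best_v = v
    length = np.linalg.norm(best_v)
    return best_v / length, length

# Toy data: the x-axis and the line spanned by (1, 1) in R^2.
Ys = [np.array([[1.0], [0.0]]), np.array([[1.0], [1.0]]) / np.sqrt(2.0)]
# For lines, alpha_i = +1 or -1, so there are four real candidates
# (hypothetical solver output).
candidates = [[np.array([s1]), np.array([s2])]
              for s1 in (1.0, -1.0) for s2 in (1.0, -1.0)]
ell, length = best_line(Ys, candidates)
```

For these two lines the selected direction is the bisector of the acute angle between them, and its negative is an equally valid representative of the same line L.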
Example
Consider five subspaces $\{[Y_1], \ldots, [Y_5]\}$ of $\mathbb{R}^{10}$ of dimension 4, 3, 3, 2, 2, respectively.
$$Y_1 = \begin{bmatrix}
-0.5962 & 0.3174 & 0.2941 & -0.2875 \\
-0.0603 & -0.3950 & -0.3179 & 0.2090 \\
-0.0272 & -0.4151 & -0.1017 & -0.0415 \\
0.3456 & -0.2119 & -0.2427 & -0.2809 \\
0.0143 & 0.4002 & -0.1767 & 0.4801 \\
-0.1597 & 0.1981 & -0.5490 & -0.0530 \\
-0.2306 & 0.1088 & -0.5871 & -0.0340 \\
0.1345 & -0.1462 & 0.2434 & 0.5743 \\
0.1707 & 0.3994 & -0.0813 & 0.3273 \\
-0.6282 & -0.3660 & 0.0034 & 0.3488
\end{bmatrix}, \qquad
Y_2 = \begin{bmatrix}
-0.4094 & -0.0947 & 0.5107 \\
0.1691 & -0.0587 & 0.1034 \\
-0.4253 & -0.4539 & -0.4816 \\
-0.3984 & -0.0848 & -0.3607 \\
0.0201 & -0.3521 & 0.5206 \\
-0.3749 & 0.1359 & 0.2460 \\
0.2957 & 0.0132 & -0.0927 \\
0.2952 & -0.4415 & -0.0481 \\
0.2068 & 0.4366 & -0.1546 \\
-0.3255 & 0.4962 & 0.0052
\end{bmatrix}$$
$$Y_3 = \begin{bmatrix}
-0.2255 & 0.6196 & -0.2330 \\
0.2284 & -0.1718 & -0.0372 \\
0.1889 & -0.5083 & -0.2066 \\
0.0444 & -0.2073 & 0.0153 \\
0.2250 & 0.2398 & -0.1067 \\
0.1891 & 0.0588 & -0.7539 \\
-0.3656 & 0.2100 & -0.1226 \\
-0.4225 & -0.3401 & -0.5230 \\
0.5668 & 0.1216 & -0.0760 \\
-0.3735 & -0.2229 & 0.1653
\end{bmatrix}, \qquad
Y_4 = \begin{bmatrix}
-0.0912 & -0.0890 \\
-0.4800 & -0.4570 \\
0.1165 & 0.1897 \\
-0.0344 & -0.4760 \\
0.2318 & -0.1997 \\
0.2364 & -0.5565 \\
0.1639 & 0.0354 \\
-0.5517 & -0.0197 \\
-0.5101 & 0.3158 \\
-0.2134 & -0.2639
\end{bmatrix}, \qquad
Y_5 = \begin{bmatrix}
-0.4559 & -0.2527 \\
0.5467 & 0.0930 \\
0.0412 & -0.0474 \\
0.0829 & -0.4653 \\
0.0428 & -0.2145 \\
-0.2961 & 0.2359 \\
0.1946 & -0.4542 \\
-0.3121 & 0.4063 \\
0.0314 & -0.3343 \\
0.5088 & 0.3521
\end{bmatrix}$$
Example
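Using the formulation:
$$\Bigl([Y_1\ Y_2\ Y_3\ Y_4\ Y_5] \otimes [Y_1\ Y_2\ Y_3\ Y_4\ Y_5] + \mathrm{diag}\bigl(\lambda_1^{4}, \lambda_2^{3}, \lambda_3^{3}, \lambda_4^{2}, \lambda_5^{2}\bigr)\Bigr) \cdot \alpha = 0$$
$$\alpha_{11}^2 + \alpha_{12}^2 + \alpha_{13}^2 + \alpha_{14}^2 = 1$$
$$\alpha_{21}^2 + \alpha_{22}^2 + \alpha_{23}^2 = 1$$
$$\alpha_{31}^2 + \alpha_{32}^2 + \alpha_{33}^2 = 1$$
$$\alpha_{41}^2 + \alpha_{42}^2 = 1$$
$$\alpha_{51}^2 + \alpha_{52}^2 = 1$$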
Example
Results:
Bertini found 172 possible real solutions in terms of $\alpha^T = (\alpha_1^T, \alpha_2^T, \ldots, \alpha_5^T)$.
For these particular matrices,
$$\ell = \pm \begin{bmatrix}
-0.590544722138823 \\
0.364851325879266 \\
0.239325653839283 \\
0.082665256947331 \\
-0.319780941882010 \\
-0.352838340337804 \\
-0.123908926359078 \\
0.320797230109883 \\
0.086312690034322 \\
0.318686707166268
\end{bmatrix}$$
which was produced by a vector v of length 3.8655.
Example
Parallel implementation of Bertini v1.3.1.
64 2.67 GHz Xeon 5650 compute nodes running the CentOS 6.4 operating system.
Using the regeneration routine, Bertini tracked 14,866 paths in approximately 52 seconds.
Among the 14,866 paths, 3,552 were successful, of which 172 produced real solutions.
Post-processing of the data to compute v and $\ell$ was done in serial in negligible time.
Future Work
A few notes and next steps:
The ambient dimension n can be made arbitrarily large.
Find applications outside of math such as image classification
problems.
Relax to a symmetric eigenvalue problem and use parameter
homotopies for faster solving.
Find an entire flag of max-length-vector-lines of best fit.
Compute new mean representatives similar to this method.
Thank you for listening!