A Class of Problems

We use
• numerical continuation
• bifurcation theory with symmetries
to analyze a class of optimization problems of the form

$$\max_{q \in \Delta} F(q, \beta) = \max_{q \in \Delta} \big( G(q) + \beta D(q) \big).$$

The goal is to solve for $\beta = B \in (0, \infty)$, where:
• $\Delta := \{\, q(z|y) \mid \sum_{z \in Z} q(z|y) = 1 \ \ \forall y \in Y \,\} \subset \mathbb{R}^n$.
• G and D are infinitely differentiable in the interior of $\Delta$.
• G has a known local maximum.
• G and D must be invariant under relabeling of the classes.

Problems in this Class

• Deterministic Annealing (Rose 1998): $\max_{q \in \Delta} H(Z|Y) - \beta D(Y,Z)$. Clustering algorithm.
• Rate Distortion Theory (Shannon ~1950): $\max_{q \in \Delta} -I(Y,Z) - \beta D(Y,Z)$. Optimal source coding.
• Information Distortion (Dimitrov and Miller 2001): $\max_{q \in \Delta} H(Z|Y) + \beta I(X,Z)$. Used in neural coding.
• Information Bottleneck Method (Tishby, Pereira, Bialek 2000): $\max_{q \in \Delta} -I(Y,Z) + \beta I(X,Z)$. Used for document classification, gene expression, neural coding and spectral analysis.

Rate Distortion

How well is the source X represented by Z? Z is a representation of X using N symbols (or clusters).
[Figure: source X with distribution p(X).]

Information Distortion

A good communication system has p(X,Y) like:
[Figure: joint distribution p(X,Y) with $2^{H(X)}$ input sequences, $2^{H(Y)}$ output sequences, and $2^{I(X,Y)}$ distinguishable input/output classes of (x,y) pairs, labeled 1 to 4. Size of an input/output class: $2^{H(X|Y) + H(Y|X)}$ pairs.]
[Diagram: input source X, channel P(Y|X), output source Y, quantizer q*(Z|Y), clustered outputs Z; Q*(Z|X) relates X to Z.]

Goal: Determine the input/output classes of (x,y) pairs.
Idea: We seek to quantize (X,Y) into clusters which correspond to the input/output classes.
Method: We determine a quantizer, Q*, between X and Z, where Z is a representation of Y using N elements, such that the cost function F(Q*, B) is a maximum for some $B \in (0, \infty)$.

Some nice properties of the problem

The feasible region $\Delta$, a product of simplices, is nice.
Lemma: $\Delta$ is the convex hull of vertices($\Delta$).
When D is convex, the optimal quantizer q* is DETERMINISTIC.
[Figure: simplices over y1, y2, y3.]
Theorem: The extrema of the problem lie generically on the vertices of $\Delta$.
Corollary: The optimal quantizer is invariant to small perturbations in the model.
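To make the class concrete, the sketch below evaluates the Information Distortion cost $F(q,\beta) = H(Z|Y) + \beta I(X,Z)$ for a small joint distribution. It is only an illustration under stated assumptions: the function name `information_distortion`, the 2x2 joint distribution `p_xy`, and the two example quantizers are hypothetical and not taken from the talk; entropies are in nats.

```python
import math

def information_distortion(p_xy, q, beta):
    """Return F(q, beta) = H(Z|Y) + beta * I(X,Z), in nats.

    p_xy[x][y] is the joint distribution p(x, y);
    q[z][y] is the quantizer q(z|y), with sum over z equal to 1 for each y.
    """
    n_x, n_y, n_z = len(p_xy), len(p_xy[0]), len(q)
    p_x = [sum(p_xy[x]) for x in range(n_x)]
    p_y = [sum(p_xy[x][y] for x in range(n_x)) for y in range(n_y)]

    # H(Z|Y) = -sum_y p(y) sum_z q(z|y) log q(z|y), using 0 log 0 = 0.
    h_z_given_y = -sum(
        p_y[y] * q[z][y] * math.log(q[z][y])
        for z in range(n_z) for y in range(n_y) if q[z][y] > 0
    )

    # p(x, z) = sum_y q(z|y) p(x, y), and p(z) = sum_x p(x, z).
    p_xz = [[sum(q[z][y] * p_xy[x][y] for y in range(n_y))
             for z in range(n_z)] for x in range(n_x)]
    p_z = [sum(p_xz[x][z] for x in range(n_x)) for z in range(n_z)]

    # I(X,Z) = sum_{x,z} p(x,z) log( p(x,z) / (p(x) p(z)) ).
    i_xz = sum(
        p_xz[x][z] * math.log(p_xz[x][z] / (p_x[x] * p_z[z]))
        for x in range(n_x) for z in range(n_z) if p_xz[x][z] > 0
    )
    return h_z_given_y + beta * i_xz

# Hypothetical toy data: two inputs, two outputs, N = 2 classes.
p_xy = [[0.4, 0.1], [0.1, 0.4]]
q_uniform = [[0.5, 0.5], [0.5, 0.5]]        # q(z|y) = 1/N
q_deterministic = [[1.0, 0.0], [0.0, 1.0]]  # a vertex of the feasible region

for beta in (1.0, 5.0):
    print(beta,
          information_distortion(p_xy, q_uniform, beta),
          information_distortion(p_xy, q_deterministic, beta))
```

In this toy case the uniform quantizer scores higher at $\beta = 1$ and the deterministic (vertex) quantizer scores higher at $\beta = 5$, which matches the qualitative picture of a solution that moves from $q \equiv 1/N$ toward a vertex as $\beta$ grows.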
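Solution of the problem when p(X,Y) := 4 Gaussian blobs

[Figure: panels showing p(X,Y) and I(X,Z) vs. N.]

The Dynamical System

Goal: To efficiently solve $\max_{q \in \Delta} (G(q) + \beta D(q))$ for each $\beta$, incremented in sufficiently small steps, as $\beta \to B$.

Method: Study the equilibria of the flow

$$\begin{pmatrix} \dot q \\ \dot \lambda \end{pmatrix} = \nabla_{q,\lambda} \mathcal{L}(q, \lambda, \beta) := \nabla_{q,\lambda} \Big( G(q) + \beta D(q) + \sum_{y \in Y} \lambda_y \big( \sum_{z} q(z|y) - 1 \big) \Big).$$

• The Jacobian with respect to q of the K constraints $\{\sum_z q(z|y) - 1\}$ is $J = (I_K \ I_K \ \dots \ I_K)$.
• The equilibrium at $\beta = 0$ is $q^*(0) \equiv 1/N$.
• The Hessian $\nabla^2_{q,\lambda} \mathcal{L}(q, \lambda, \beta) = \begin{pmatrix} \nabla^2_q F & J^T \\ J & 0 \end{pmatrix}$ determines stability and the location of bifurcation.

Assumptions:
• Let q* be a local solution to the problem and fixed by $S_M$.
• Call the M identical blocks of $\nabla^2_q F(q^*, \beta)$: B. Call the other N-M blocks of $\nabla^2_q F(q^*, \beta)$: $\{R_\nu\}$.
• At a singularity $(q^*, \lambda^*, \beta^*)$, B has a single nullvector v and $R_\nu$ is nonsingular for every $\nu$.
• If M < N, then $B \sum_\nu R_\nu^{-1} + M I_K$ is nonsingular.

Theorem: If $(q^*, \lambda^*, \beta^*)$ is a bifurcation of equilibria of the flow, then $\beta^* \ge 1$.

For the four blob problem when N > 2, the first bifurcation is subcritical (a first order phase transition).

Investigating the Dynamical System

How:
• Use numerical continuation in a constrained system to choose $\beta$ and to choose an initial guess to find the equilibria $q^*(\beta)$.
• Use bifurcation theory with symmetries to understand bifurcations of the equilibria.

Continuation

[Figure: continuation step in the $(\beta, q)$ plane from $(q_k^*, \beta_k)$ along the tangent $\partial_\beta q_k$ to the initial guess $(q_{k+1}^{(0)}, \beta_{k+1}^{(0)})$, and from there to $(q_{k+1}^*, \beta_{k+1})$.]

• A local maximum $q_k^*(\beta_k)$ of the problem is an equilibrium of the gradient flow.
• The initial condition $q_{k+1}^{(0)}(\beta_{k+1}^{(0)})$ is sought in the tangent direction $\partial_\beta q_k$, which is found by solving the matrix system

$$\nabla^2_{q,\lambda} \mathcal{L}(q_k, \lambda_k, \beta_k) \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix} = -\partial_\beta \nabla_{q,\lambda} \mathcal{L}(q_k, \lambda_k, \beta_k).$$

• The continuation algorithm used to find $q_{k+1}^*(\beta_{k+1})$ is based on Newton's method.

[Figure: panel "Conceptual Bifurcation Structure", showing $q^*(Y_N|Y)$ against $\beta$ with the branch $q^* = 1/N$ and the bifurcations of $q^*(\beta)$; panel "Observed Bifurcations for the 4 Blob Problem".]

Bifurcations with symmetry

To better understand the bifurcation structure, we use the symmetries of the cost function $F(q, \beta)$. The symmetry is that $F(q, \beta)$ is invariant to relabeling of the N classes of Z. The symmetry group of all permutations on N symbols is $S_N$.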
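A minimal sketch of how relabeling acts on a quantizer may help here. It stores q as N class blocks, applies every permutation in $S_N$, and counts the permutations that leave q unchanged. The helper names (`relabel`, `fixing_permutations`, `column_sums`) and both example quantizers are hypothetical, chosen only for illustration.

```python
from itertools import permutations

def relabel(q, perm):
    """Relabel the N classes: new class z receives the block of class perm[z]."""
    return [q[perm[z]] for z in range(len(q))]

def fixing_permutations(q):
    """All permutations of the class labels that leave q unchanged."""
    return [perm for perm in permutations(range(len(q)))
            if relabel(q, perm) == q]

def column_sums(q):
    """sum_z q(z|y) for each y; relabeling must keep these equal to 1."""
    return [sum(q[z][y] for z in range(len(q))) for y in range(len(q[0]))]

# q[z][y] = q(z|y) with N = 4 classes and K = 2 values of y (toy numbers).
q_uniform = [[0.25, 0.25] for _ in range(4)]                   # q = 1/N
q_partial = [[0.7, 0.1], [0.1, 0.3], [0.1, 0.3], [0.1, 0.3]]   # class 0 resolved

print(len(fixing_permutations(q_uniform)))  # 24, all of S_4
print(len(fixing_permutations(q_partial)))  # 6, a copy of S_3
print(column_sums(relabel(q_partial, (3, 2, 1, 0))))
```

The uniform quantizer is fixed by every relabeling, while a quantizer with one class already separated from three identical ones is fixed only by the permutations of those three classes.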
The action of $S_N$ on $(q, \lambda)$ and on $\nabla_{q,\lambda} \mathcal{L}(q, \lambda, \beta)$ is represented by the finite Lie group

$$\Gamma := \left\{ \begin{pmatrix} P & 0_{n \times K} \\ 0_{K \times n} & I_K \end{pmatrix} \right\},$$

where P is a "block permutation" matrix. The symmetry of $(q, \lambda)$ is measured by its isotropy group, the subgroup of $\Gamma$ which fixes it.

What do the bifurcations look like?

The Equivariant Branching Lemma gives the existence of bifurcating solutions for every isotropy subgroup which fixes a one dimensional subspace of $\ker \nabla^2_{q,\lambda} \mathcal{L}(q^*, \lambda^*, \beta^*)$.

Theorem: Let $(q^*, \lambda^*, \beta^*)$ be a singular point of the flow $(\dot q, \dot \lambda) = \nabla_{q,\lambda} \mathcal{L}(q, \lambda, \beta)$ such that q* is fixed by $S_M$. Then there exist M bifurcating solutions, $(q^*, \lambda^*, \beta^*) + (t u_k, 0, \beta(t))$, each with isotropy group $S_{M-1}$, where

$$[u_k]_\nu = \begin{cases} (M-1)\,v & \text{if } \nu \text{ is the } k\text{th unresolved class} \\ -v & \text{if } \nu \ne k \text{ is any other unresolved class} \\ 0 & \text{otherwise} \end{cases}$$

and v is a nullvector of an unresolved block of the Hessian.

Bifurcation Structure

Let $T(q^*, \beta^*)$ be the scalar formed from the third-order term $3 \langle u_k, \partial^3 \mathcal{L}[u_k, P_L\, L\, P_L\, \partial^3 \mathcal{L}[u_k, u_k]] \rangle$ and the fourth-order term $(M^2 - 3M + 3)\, \partial^4 F[u_k, u_k, u_k]$.

Pitchfork-like bifurcations.
Theorem: All bifurcations are "pitchfork-like".

Branch orientation?
Theorem: If $T(q^*, \beta^*) > 0$, then the branch is supercritical. If $T(q^*, \beta^*) < 0$, then the branch is subcritical.

Branch stability?
Theorem: If $T(q^*, \beta^*) < 0$, then all branches fixed by $S_{M-1}$ are unstable.

Partial lattice of the isotropy subgroups of $S_4$ (and associated bifurcating directions)

$S_4$
  $S_3$: (3v, -v, -v, -v, 0)
    $S_2$: (0, 2v, -v, -v, 0); (0, -v, 2v, -v, 0); (0, -v, -v, 2v, 0)
  $S_3$: (-v, 3v, -v, -v, 0)
    $S_2$: (2v, 0, -v, -v, 0); (-v, 0, 2v, -v, 0); (-v, 0, -v, 2v, 0)
  $S_3$: (-v, -v, 3v, -v, 0)
    $S_2$: (2v, -v, 0, -v, 0); (-v, 2v, 0, -v, 0); (-v, -v, 0, 2v, 0)
  $S_3$: (-v, -v, -v, 3v, 0)
    $S_2$: (2v, -v, -v, 0, 0); (-v, 2v, -v, 0, 0); (-v, -v, 2v, 0, 0)
1

For the 4 blob problem: the isotropy subgroups and bifurcating directions of the observed bifurcating branches

isotropy group: $S_4 \to S_3 \to S_2 \to 1$
bif direction: $(-v, -v, 3v, -v, 0)^T$, then $(-v, 2v, 0, -v, 0)^T$, then $(-v, 0, 0, v, 0)^T$ ... No more bifs!

Other Branches

The Smoller-Wasserman Theorem ascertains the existence of bifurcating branches for every maximal isotropy subgroup.

Theorem: If M is a composite number, then there exist bifurcating solutions with isotropy group $\langle \gamma^p \rangle$ for every element $\gamma$ of order M in $\Gamma$ and every prime $p \mid M$.
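The bifurcating direction is in the (p-1)-dimensional subspace of $\ker \nabla^2_{q,\lambda} \mathcal{L}(q^*, \lambda^*, \beta^*)$ which is fixed by $\langle \gamma^p \rangle$.

Lattice of the maximal isotropy subgroups $\langle \gamma^p \rangle$ in $S_4$

[Figure: lattice with $S_4$ at the top, $A_4$ and the cyclic subgroups generated by 4-cycles below it, and $\langle (1234)^2 \rangle$, $\langle (1324)^2 \rangle$, $\langle (1243)^2 \rangle$ at the bottom, each shown with a bifurcating direction whose four class entries are $\pm v$.]

The above theorem states that there are bifurcating solutions from $q_{1/4}$ with symmetry $\langle (1234)^2 \rangle$, $\langle (1243)^2 \rangle$, $\langle (1324)^2 \rangle$.

The full lattice of subgroups of the group $S_M$ is not known for arbitrary M.

A numerical algorithm to solve $\max_{q \in \Delta} F(q, \beta)$

Let $q_0$ be the maximizer of $\max_q G(q)$, $\beta_0 = 1$ and $\Delta s > 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q (G(q) + \beta_k D(q))$. Iterate the following steps until $\beta_K = B$ for some K.

1. Perform $\beta$-step: solve
   $$\nabla^2_{q,\lambda} \mathcal{L}(q_k, \lambda_k, \beta_k) \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix} = -\partial_\beta \nabla_{q,\lambda} \mathcal{L}(q_k, \lambda_k, \beta_k)$$
   for $(\partial_\beta q_k, \partial_\beta \lambda_k)$ and select $\beta_{k+1} = \beta_k + d_k$, where $d_k = \Delta s / (\|\partial_\beta q_k\|^2 + \|\partial_\beta \lambda_k\|^2 + 1)^{1/2}$.
2. The initial guess for $q_{k+1}$ at $\beta_{k+1}$ is $q_{k+1}^{(0)} = q_k + d_k\, \partial_\beta q_k$.
3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ to get the maximizer $q^*_{k+1}$, using initial guess $q_{k+1}^{(0)}$.
4. Check for bifurcation: compare the sign of the determinant of an identical block of each of $\nabla^2_q [G(q_k) + \beta_k D(q_k)]$ and $\nabla^2_q [G(q_{k+1}) + \beta_{k+1} D(q_{k+1})]$. If a bifurcation is detected, then set $q_{k+1}^{(0)} = q_k + d_k u$, where u is a bifurcating direction $u_k$ given by the Equivariant Branching Lemma, and repeat step 3.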
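The four steps above can be sketched on a scalar toy problem. Everything specific here is an assumption for illustration and not from the talk: the toy functions $G(q) = -q^4/4 - q^2$ and $D(q) = q^2/2$, the absence of constraints (so there is no $\lambda$ and the "determinant of a block" is just the second derivative), the branch direction `u = 1.0`, and the use of plain gradient ascent in step 3 in place of a Newton-based optimizer. For this toy problem the trivial branch $q = 0$ loses stability at $\beta = 2$ in a supercritical pitchfork, unlike the subcritical first bifurcation reported for the four blob problem.

```python
import math

# Hypothetical scalar toy problem: F(q, beta) = G(q) + beta * D(q)
# with G(q) = -q**4/4 - q**2 and D(q) = q**2/2 (not from the talk).
def grad_F(q, beta):
    return -q**3 + (beta - 2.0) * q

def hess_F(q, beta):
    return -3.0 * q**2 + (beta - 2.0)

def grad_D(q):
    return q  # equals d/dbeta of grad_F

def maximize(q, beta, eta=0.5, tol=1e-12, max_iter=100000):
    """Step 3: gradient ascent from the initial guess (swapped in for Newton)."""
    for _ in range(max_iter):
        g = grad_F(q, beta)
        if abs(g) < tol:
            break
        q += eta * g
    return q

def continuation(B=3.0, ds=0.15, u=1.0):
    q, beta = 0.0, 1.0              # q0 maximizes G, beta0 = 1
    path, bifurcations = [(beta, q)], []
    while beta < B:
        # Step 1: tangent d q / d beta and step length d_k.
        dq = -grad_D(q) / hess_F(q, beta)
        d = ds / math.sqrt(dq * dq + 1.0)
        beta_next = min(beta + d, B)   # clip so the last step lands on B
        d = beta_next - beta
        # Steps 2 and 3: initial guess along the tangent, then optimize.
        q_next = maximize(q + d * dq, beta_next)
        # Step 4: a sign change of the Hessian signals a bifurcation;
        # restart the optimization from a guess along the direction u.
        if hess_F(q, beta) * hess_F(q_next, beta_next) < 0.0:
            bifurcations.append(beta_next)
            q_next = maximize(q + d * u, beta_next)
        q, beta = q_next, beta_next
        path.append((beta, q))
    return path, bifurcations

path, bifurcations = continuation()
print(bifurcations)
print(path[-1])
```

With these defaults the sketch stays on $q = 0$ until the first step past $\beta = 2$, detects the sign change there, and then follows the branch $q = \sqrt{\beta - 2}$ up to $\beta = B$. Choosing `u = -1.0` would follow the mirror-image branch instead.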