Continuation and Symmetry Breaking Bifurcation of the Information Distortion Function September 19, 2002

Albert E. Parker
Complex Biological Systems
Department of Mathematical Sciences
Center for Computational Biology
Montana State University
Collaborators:
Tomas Gedeon
Alexander Dimitrov
John P. Miller
Zane Aldworth
Bryan Roosien
Outline
Our Problem
A Class of Problems
Continuation
Bifurcation with Symmetries
How we can efficiently solve the Class of Problems
The Neural Coding Problem:
We want to understand the neural code.
We seek an answer to the question:
How does neural activity represent information about environmental stimuli?
“The little fly sitting in the fly’s brain trying to fly the fly”
The mathematical problem
Optimizing the Information Distortion Function
$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} \big( H(Z|Y) + \beta\, I(X,Z) \big)$
• We are really just interested in $\max_{q \in \Delta} I(X,Z)$.
• Coding scheme: environmental stimuli $X$ → [$Q(Y|X)$] → neural responses $Y$ → [$q(Z|Y)$] → neural responses in $N$ clusters, $Z$.
• $\Delta := \{\, q(z|y) \;|\; \sum_{z \in Z} q(z|y) = 1 \ \ \forall\, y \in Y \,\} \subset \mathbb{R}^n$
• $H(Z|Y)$ is the conditional entropy of $Z|Y$.
• $I(X,Z)$ is the mutual information between $X$ and $Z$.
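As a rough illustration of the objective, the sketch below evaluates $H(Z|Y)$, $I(X,Z)$ and $F = H(Z|Y) + \beta I(X,Z)$ for a discrete joint distribution. The function name `information_distortion`, the array layout (rows of `p_xy` index stimuli, columns index responses; rows of `q_zy` index classes) and the use of natural logarithms are assumptions made for this example, not details taken from the slides.

```python
import numpy as np

def information_distortion(p_xy, q_zy, beta):
    """Return (F, H(Z|Y), I(X,Z)) for a joint p(x,y) and a clustering q(z|y).

    p_xy : array of shape (|X|, |Y|), entries sum to 1 (assumed layout)
    q_zy : array of shape (N, |Y|), each column sums to 1 (assumed layout)
    """
    p_y = p_xy.sum(axis=0)          # marginal p(y)
    p_x = p_xy.sum(axis=1)          # marginal p(x)

    # H(Z|Y) = -sum_y p(y) sum_z q(z|y) log q(z|y), with 0 log 0 := 0
    safe_q = np.where(q_zy > 0, q_zy, 1.0)
    h_zy = -np.sum(p_y * np.sum(q_zy * np.log(safe_q), axis=0))

    # p(x,z) = sum_y p(x,y) q(z|y)
    p_xz = p_xy @ q_zy.T
    p_z = p_xz.sum(axis=0)

    # I(X,Z) = sum_{x,z} p(x,z) log( p(x,z) / (p(x) p(z)) )
    outer = np.outer(p_x, p_z)
    mask = p_xz > 0
    i_xz = np.sum(p_xz[mask] * np.log(p_xz[mask] / outer[mask]))

    return h_zy + beta * i_xz, h_zy, i_xz
```

With the uniform clustering $q \equiv 1/N$ this gives $H(Z|Y) = \log N$ and $I(X,Z) = 0$, which is why the entropy term dominates at small $\beta$.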
Annealing
At $\beta = 0$: $\max_{q \in \Delta} H(Z|Y)$
At $\beta = \beta^* \approx 1$, we observe bifurcation: $\max_{q \in \Delta} H(Z|Y) + \beta^* I(X,Z)$
At $\beta = \infty$: $\max_{q \in \Delta} I(X,Z)$
Application of the method to 4 Gaussian blobs
[Figure panel: Random clusters]
Similar Problems
• Information Bottleneck Method (Tishby, Pereira, Bialek 2000)
  $\max_{q \in \Delta} -I(Y,Z) + \beta\, I(X,Z)$
  Used for document classification, gene expression, neural coding and spectral analysis.
• Deterministic Annealing (Rose 1998)
  $\max_{q \in \Delta} H(Z|Y) + \beta\, D(Y,Z)$
  A clustering algorithm.
• Rate Distortion Theory (Shannon ~1950)
  $\max_{q \in \Delta} -I(Y,Z) + \beta\, D(Y,Z)$
The Class of Problems
$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} \big( G(q) + \beta D(q) \big)$
To apply the bifurcation theory, the above problem must satisfy:
• $G$ and $D$ are infinitely differentiable in $\Delta$.
• $G$ is strictly concave.
• $G$ and $D$ must be invariant under relabeling of the classes.
• The Hessian of $F$, written $\Delta_q F$, is block diagonal with $N$ blocks $\{B_\nu\}$, and $B_\nu = B_\eta$ if $q(z_\nu|y) = q(z_\eta|y)$ for every $y \in Y$:
\[ \Delta_q F = \begin{pmatrix} B_1 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & B_N \end{pmatrix} \]
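The Dynamical System
Goal: To efficiently solve $\max_{q \in \Delta} (G(q) + \beta D(q))$ for each $\beta$, incremented in sufficiently small steps, as $\beta \to \infty$.
Method: Study the equilibria of the flow
\[ \begin{pmatrix} \dot q \\ \dot\lambda \end{pmatrix} = \nabla_{q,\lambda} L(q,\lambda,\beta) := \nabla_{q,\lambda} \Big( G(q) + \beta D(q) + \sum_{y \in Y} \lambda_y \Big( \sum_z q(z|y) - 1 \Big) \Big) \]
• $\nabla_{q,\lambda} L : \mathbb{R}^{n+K} \times \mathbb{R} \to \mathbb{R}^{n+K}$
• The Jacobian with respect to $q$ of the $K$ constraints $\{\sum_z q(z|y) - 1\}$ is $J = (I_K \; I_K \; \cdots \; I_K)$.
• If $w^T \Delta_q F(q^*,\beta)\, w < 0$ for every nonzero $w \in \ker J$, then $q^*(\beta)$ is a maximizer of the optimization problem above.
• The first equilibrium is $q^*(\beta_0 = 0) \equiv 1/N$.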
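The second-order condition on $\ker J$ can be checked numerically. The following is a minimal sketch, assuming $q$ is stored class by class so that $n = NK$ and $J = (I_K \; I_K \; \cdots \; I_K)$; the helper names `constraint_jacobian`, `kernel_basis` and `is_maximizer` are hypothetical, and the kernel basis is obtained from an SVD rather than from any method stated in the slides.

```python
import numpy as np

def constraint_jacobian(N, K):
    """J = (I_K I_K ... I_K), the K x NK Jacobian of the normalization constraints."""
    return np.hstack([np.eye(K)] * N)

def kernel_basis(J, tol=1e-10):
    """Orthonormal basis of ker J (as columns), taken from the SVD of J."""
    _, s, vt = np.linalg.svd(J)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

def is_maximizer(hess_F, N, K, tol=1e-10):
    """True if w^T hess_F w < 0 for every nonzero w in ker J."""
    W = kernel_basis(constraint_jacobian(N, K))
    reduced = W.T @ hess_F @ W
    reduced = 0.5 * (reduced + reduced.T)   # symmetrize against round-off
    return bool(np.all(np.linalg.eigvalsh(reduced) < -tol))
```

Only directions inside $\ker J$ matter: a Hessian may have a positive eigenvalue along a direction that violates the constraints and still pass the test.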
Investigating the Dynamical System
How:
• Use numerical continuation in a constrained system to choose $\beta$ and to choose an initial guess to find the equilibria $q^*(\beta)$.
• Use bifurcation theory with symmetries to understand bifurcations of the equilibria.
Properties of the Dynamical System
• In our dynamical system $(\dot q, \dot\lambda)^T = \nabla_{q,\lambda} L(q,\lambda,\beta)$, the Hessian
\[ \Delta_{q,\lambda} L(q,\lambda,\beta) = \begin{pmatrix} \Delta_q F & J^T \\ J & 0 \end{pmatrix} \]
determines the stability of equilibria and the location of bifurcation.
Theorem: $(q^*,\beta^*)$ is a bifurcation of equilibria of the flow if and only if $\Delta_{q,\lambda} L(q^*,\beta^*)$ is singular.
Theorem: Let $(q^*,\beta^*)$ be a local solution to the optimization problem. If $\Delta_{q,\lambda} L(q^*,\beta^*)$ is singular, then $\Delta_q F(q^*,\beta^*)$ is singular.
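To see the two theorems at work on the smallest possible case, the sketch below assembles the bordered Hessian for given blocks and measures how close it is to singular. The function names are hypothetical, and the example in the note (N = 2 classes, K = 1, two identical 1×1 blocks) is a made-up toy, not data from the talk.

```python
import numpy as np

def bordered_hessian(hess_F, N, K):
    """Assemble [[hess_F, J^T], [J, 0]] with J = (I_K ... I_K)."""
    J = np.hstack([np.eye(K)] * N)
    return np.block([[hess_F, J.T], [J, np.zeros((K, K))]])

def smallest_singular_value(A):
    """Distance-to-singularity proxy: the smallest singular value of A."""
    return np.linalg.svd(A, compute_uv=False)[-1]
```

For N = 2, K = 1 and identical blocks $B_1 = B_2 = b$, the bordered matrix has determinant $-2b$, so it is singular exactly when the block $b$ is singular, and at $b = 0$ its kernel is spanned by $(1, -1, 0)^T$.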
Continuation
• A local maximum $q_k^*(\beta_k)$ of the optimization problem is an equilibrium of the gradient flow.
• The initial condition $q_{k+1}^{(0)}(\beta_{k+1}^{(0)})$ is sought in the tangent direction $\partial_\beta q_k$, which is found by solving the matrix system
\[ \Delta_{q,\lambda} L(q_k,\lambda_k,\beta_k) \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix} = -\partial_\beta \nabla_{q,\lambda} L(q_k,\lambda_k,\beta_k) \]
• The continuation algorithm used to find $q_{k+1}^*(\beta_{k+1})$ is based on Newton's method.
• We cannot use Newton's method directly, since we have a constrained problem.
• Instead, we use an Augmented Lagrangian or an implicit solution method.
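The tangent system can be sketched in a few lines. Because the constraints do not depend on $\beta$, the right-hand side $-\partial_\beta \nabla_{q,\lambda} L$ reduces to $-(\nabla_q D(q_k), 0)^T$; the sketch uses that fact. The function name `tangent_direction` and its argument layout are assumptions for illustration, and a dense solve stands in for whatever linear solver was actually used.

```python
import numpy as np

def tangent_direction(hess_L, grad_D, K):
    """Solve hess_L (dq, dlam)^T = -(grad_D, 0)^T for the tangent (dq, dlam).

    hess_L : (n+K) x (n+K) bordered Hessian at (q_k, lambda_k, beta_k)
    grad_D : gradient of D with respect to q at q_k (length n)
    K      : number of constraints (length of lambda)
    """
    n = grad_D.size
    rhs = -np.concatenate([grad_D, np.zeros(K)])
    sol = np.linalg.solve(hess_L, rhs)
    return sol[:n], sol[n:]
```

The last $K$ rows of the system read $J\, \partial_\beta q_k = 0$, so the predicted step keeps every column of $q$ normalized to first order.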
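[Figure: one continuation step in the $(\beta, q)$ plane, from the solution $(q_k^*, \beta_k^*)$ along the tangent to the initial guess $(q_{k+1}^{(0)}, \beta_{k+1}^{(0)})$, and from there to the solution $(q_{k+1}^*, \beta_{k+1}^*)$.]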
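Bifurcation!
What to do? How to continue after bifurcation?
[Figure: bifurcation diagram of $q^*$ against $\beta$, starting from the branch $q^* \equiv 1/N$.]
We do not want to get stuck on a local max that is not the global max.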
Conceptual Bifurcation Structure
[Figure: bifurcations of $q^*(\beta)$, plotted as $q^*(Y_N|Y)$ against $\beta$, starting from $q^* \equiv 1/N$.]
Observed Bifurcations for the 4 Blob Problem
Questions …
• How to detect bifurcation?
• What kinds of bifurcations do we expect? And how many bifurcating solutions are there?
• How to choose a direction to take after bifurcation is detected?
• Explain the nice bifurcation structure that is observed numerically.
  1. Why are there only N-1 bifurcations observed?
  2. Why are there no bifurcations observed after all N classes have resolved?
Bifurcations with symmetry
To better understand the bifurcation structure, we capitalize on the symmetries of the optimization function $F(q,\beta)$.
The "obvious" symmetry is that $F(q,\beta)$ is invariant to relabeling of the $N$ classes of $Z$.
The symmetry group of all permutations on $N$ symbols is $S_N$.
We need to define the action of $S_N$ on $q$ and $F(q,\beta)$ …
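One concrete way to realize this action, assuming $q$ is stored as $N$ consecutive blocks of length $K$ (one block per class), is a block permutation matrix built with a Kronecker product. The function name `block_permutation` and the convention that block $\nu$ of the result is block `perm[nu]` of the input are choices made for this sketch.

```python
import numpy as np

def block_permutation(perm, K):
    """Return the NK x NK matrix rho with (rho q)_block[nu] = q_block[perm[nu]]."""
    N = len(perm)
    P = np.zeros((N, N))
    P[np.arange(N), perm] = 1.0          # ordinary N x N permutation matrix
    return np.kron(P, np.eye(K))         # each 1 becomes I_K, each 0 a K x K zero block

def neg_entropy_sum(q):
    """A relabeling-invariant test function: sum of q log q over all entries."""
    return float(np.sum(q * np.log(q)))
```

For N = 3, `block_permutation([1, 0, 2], K)` swaps the first two class blocks and leaves the third alone, and any function that only sums over classes, such as `neg_entropy_sum`, takes the same value before and after the relabeling.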
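The Groups
• Let $\mathcal{P}$ be the finite group of $n \times n$ "block" permutation matrices which represents the action of $S_N$ on $q$ and $F(q,\beta)$. For example, if $N=3$,
\[ \begin{pmatrix} 0 & I_K & 0 \\ I_K & 0 & 0 \\ 0 & 0 & I_K \end{pmatrix} \in \mathcal{P} \]
permutes $q(z_1|y)$ with $q(z_2|y)$ for every $y$.
• $F(q,\beta)$ is $\mathcal{P}$-invariant means that for every $\rho \in \mathcal{P}$, $F(\rho q,\beta) = F(q,\beta)$.
• Let $\Gamma$ be the finite group of $(n+K) \times (n+K)$ block permutation matrices which represents the action of $S_N$ on $(q,\lambda)^T$ and $\nabla_{q,\lambda} L(q,\lambda,\beta)$:
\[ \Gamma := \left\{ \begin{pmatrix} \rho & 0_{n \times K} \\ 0_{K \times n} & I_{K \times K} \end{pmatrix} \;\middle|\; \rho \in \mathcal{P} \right\} \]
The Lagrange multipliers and constraints are fixed!
• $\nabla_{q,\lambda} L(q,\lambda,\beta)$ is $\Gamma$-equivariant means that for every $\gamma \in \Gamma$,
\[ \gamma\, \nabla_{q,\lambda} L(q,\lambda,\beta) = \nabla_{q,\lambda} L\Big( \gamma \begin{pmatrix} q \\ \lambda \end{pmatrix}, \beta \Big) \]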
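Bifurcations with symmetry
• The symmetry of $(q,\lambda)^T$ is measured by its isotropy subgroup
\[ \Gamma_{q,\lambda} := \left\{ \gamma \in \Gamma \;\middle|\; \gamma \begin{pmatrix} q \\ \lambda \end{pmatrix} = \begin{pmatrix} q \\ \lambda \end{pmatrix} \right\} \]
• An isotropy subgroup $\Sigma$ is a maximal isotropy subgroup of $\Gamma$ if there does not exist an isotropy subgroup $\Theta$ of $\Gamma$ such that $\Sigma \subsetneq \Theta \subsetneq \Gamma$.
• At a bifurcation $\big( (q^*,\lambda^*)^T, \beta^* \big)$, the fixed point subspace of $\Gamma_{q^*,\lambda^*}$ is
\[ \mathrm{Fix}(\Gamma_{q^*,\lambda^*}) := \{\, w \in \ker \Delta_{q,\lambda} L(q^*,\lambda^*,\beta^*) \;|\; \gamma w = w \ \ \forall\, \gamma \in \Gamma_{q^*,\lambda^*} \,\} \]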
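Since the group elements leave the Lagrange multipliers untouched, the isotropy subgroup of $(q,\lambda)^T$ is determined by $q$ alone, and for small $N$ it can be found by brute force. The sketch below enumerates all $N!$ relabelings; the function name `isotropy_subgroup` and the block storage of $q$ (one length-$K$ block per class) are assumptions for this example, and the enumeration is only practical for small $N$.

```python
import itertools
import numpy as np

def isotropy_subgroup(q, N, K, tol=1e-12):
    """List the permutations of the N classes that leave q unchanged.

    q is a length-NK vector stored as N consecutive blocks of length K.
    """
    blocks = q.reshape(N, K)
    return [perm for perm in itertools.permutations(range(N))
            if np.allclose(blocks[list(perm)], blocks, atol=tol)]
```

For the uniform solution every relabeling fixes $q$, so the list has $N!$ elements; when exactly three of four class blocks coincide it has $3! = 6$ elements, a copy of $S_3$.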

Bifurcations with symmetry
One of the tools we use to describe a bifurcation in the presence of symmetries is the Equivariant Branching Lemma (Vanderbauwhede and Cicogna 1980-1).
Idea: The bifurcation structure of local solutions is described by the isotropy subgroups $\Sigma$ of $\Gamma$ which have $\dim \mathrm{Fix}(\Sigma) = 1$.
• System: $\dot x = r(x,\beta)$, $r : \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^m$
• $r(x,\beta)$ is $G$-equivariant for some compact Lie group $G$
• $r(0,0) = 0$, $\partial_x r(0,0) = 0$
• $\mathrm{Fix}(G) = \{0\}$
• Let $H$ be an isotropy subgroup of $G$ such that $\dim \mathrm{Fix}(H) = 1$.
• Assume $\partial_\beta \partial_x r(0,0)\, x_0 \neq 0$ for nontrivial $x_0 \in \mathrm{Fix}(H)$ (crossing condition).
Then there is a unique smooth solution branch $(t x_0, \beta(t))$ to $r = 0$ such that $x_0 \in \mathrm{Fix}(H)$ and the isotropy subgroup of each solution is $H$.
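A standard one-dimensional example, not taken from the talk, may help in reading the hypotheses. It assumes the scalar system below with $G = \mathbb{Z}_2$ acting by $x \mapsto -x$.

```latex
% Toy check of the Equivariant Branching Lemma hypotheses (illustrative only)
\begin{align*}
  &\dot x = r(x,\beta) = \beta x - x^3, \qquad r(-x,\beta) = -r(x,\beta)
     && \text{($\mathbb{Z}_2$-equivariance)} \\
  &r(0,0) = 0, \qquad \partial_x r(0,0) = \big(\beta - 3x^2\big)\big|_{(0,0)} = 0 \\
  &\mathrm{Fix}(\mathbb{Z}_2) = \{ x : -x = x \} = \{0\} \\
  &H = \{1\}, \qquad \mathrm{Fix}(H) = \mathbb{R}, \qquad \dim \mathrm{Fix}(H) = 1 \\
  &\partial_\beta \partial_x r(0,0)\, x_0 = x_0 \neq 0
     && \text{(crossing condition)} \\
  &r(t x_0, \beta) = 0,\ t \neq 0 \iff \beta(t) = t^2 x_0^2, \qquad \beta'(0) = 0
     && \text{(pitchfork branch)}
\end{align*}
```

Every nonzero point on the branch has isotropy subgroup $H = \{1\}$, as the lemma predicts.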
Bifurcations with symmetry
The other tool: Smoller-Wasserman Theorem (1985-6)
For variational problems where
\[ r(x,\beta) = \nabla_x f(x,\beta) \]
there is a bifurcating solution tangential to $\mathrm{Fix}(H)$ for every maximal isotropy subgroup $H$, not only those with $\dim \mathrm{Fix}(H) = 1$.
• $\dim \mathrm{Fix}(H) = 1$ implies that $H$ is a maximal isotropy subgroup.
What do the bifurcations look like?
From bifurcation, the Equivariant Branching Lemma shows that the following solutions emerge:
An equilibrium $q^*$ is called M-uniform if $\Delta_q F(q^*,\beta)$ has $M$ blocks that are identical. The $M$ classes of $Z$ corresponding to these $M$ identical blocks are called unresolved classes. The classes of $Z$ that are not unresolved are called resolved.
The first equilibrium, $q^* \equiv 1/N$, is N-uniform.
Theorem: $q^*$ is M-uniform if and only if $q^*$ is fixed by $S_M$.
What do the bifurcations look like?
Theorem: $\dim \ker \Delta_q F(q^*,\beta) = M$ with basis vectors $\{v_i\}_{i=1}^M$, where
\[ [v_i]_\nu = \begin{cases} v & \text{if } \nu \text{ is the } i\text{th unresolved class} \\ 0 & \text{otherwise} \end{cases} \]
Theorem: $\dim \ker \Delta_{q,\lambda} L(q^*,\lambda,\beta) = M-1$ with basis vectors
\[ \begin{pmatrix} v_i \\ 0 \end{pmatrix} - \begin{pmatrix} v_M \\ 0 \end{pmatrix}, \qquad i = 1, \dots, M-1 \]
Point: Since the bifurcating solutions whose existence is guaranteed by the EBL and the SW Theorem are tangential to $\ker \Delta_{q,\lambda} L(q^*,\lambda,\beta)$, we know the explicit form of the bifurcating directions.
What do the bifurcations look like?
Assumptions:
• Let $q^*$ be M-uniform.
• Call the $M$ identical blocks of $\Delta_q F(q^*,\beta)$: $B$. Call the other $N-M$ blocks of $\Delta_q F(q^*,\beta)$: $\{R_\nu\}$. We assume that $B$ has a single nullvector $v$ and that $R_\nu$ is nonsingular for every $\nu$.
• If $M<N$, then $B \sum_\nu R_\nu^{-1} + M I_K$ is nonsingular.
Theorem:
Let $(q^*,\lambda^*,\beta^*)$ be a singular point of the flow
\[ \begin{pmatrix} \dot q \\ \dot\lambda \end{pmatrix} = \nabla_{q,\lambda} L(q,\lambda,\beta) \]
such that $q^*$ is M-uniform. Then there exist $M$ bifurcating (M-1)-uniform solutions $(q^*,\lambda^*,\beta^*) + (t u_k, 0, \beta(t))$, where
\[ [u_k]_\nu = \begin{cases} (M-1)\,v & \text{if } \nu \text{ is the } k\text{th unresolved class} \\ -v & \text{if } \nu \neq k \text{ is any other unresolved class} \\ 0 & \text{otherwise} \end{cases} \]
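The directions $u_k$ in the theorem are simple to construct once the nullvector $v$ of $B$ and the set of unresolved classes are known. A minimal sketch follows; the function name `bifurcating_directions` and the zero-based class indices are assumptions for the example.

```python
import numpy as np

def bifurcating_directions(v, N, unresolved):
    """Build the M directions u_k (each of length N*K) for an M-uniform solution.

    v          : nullvector of the repeated block B (length K)
    N          : total number of classes
    unresolved : list of the M unresolved class indices (zero-based)
    """
    v = np.asarray(v, dtype=float)
    K, M = v.size, len(unresolved)
    directions = []
    for k in unresolved:
        u = np.zeros((N, K))              # resolved classes keep a zero block
        for nu in unresolved:
            u[nu] = (M - 1) * v if nu == k else -v
        directions.append(u.ravel())
    return directions
```

Each $u_k$ sums to zero over the classes, so it lies in $\ker J$ and moving along it keeps the normalization constraints satisfied to first order.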

Example: Some of the bifurcating branches when N=4 are given by the following isotropy subgroup lattice for $S_4$
[Figure: lattice $S_4 \to S_3 \to S_2 \to 1$, with each branch labeled by its bifurcating direction. The recoverable directions are listed below.]
From $S_4$ to the four copies of $S_3$:
$(3v,-v,-v,-v,0)^T$, $(-v,3v,-v,-v,0)^T$, $(-v,-v,3v,-v,0)^T$, $(-v,-v,-v,3v,0)^T$
From each $S_3$ to three copies of $S_2$:
• class 1 resolved: $(0,2v,-v,-v,0)^T$, $(0,-v,2v,-v,0)^T$, $(0,-v,-v,2v,0)^T$
• class 2 resolved: $(2v,0,-v,-v,0)^T$, $(-v,0,2v,-v,0)^T$, $(-v,0,-v,2v,0)^T$
• class 3 resolved: $(2v,-v,0,-v,0)^T$, $(-v,2v,0,-v,0)^T$, $(-v,-v,0,2v,0)^T$
• class 4 resolved: $(2v,-v,-v,0,0)^T$, $(-v,2v,-v,0,0)^T$, $(-v,-v,2v,0,0)^T$
For the 4 Blob problem:
The isotropy subgroups and bifurcating directions of the observed bifurcating branches
isotropy group: $S_4 \to S_3 \to S_2 \to 1$
bif direction: $(-v,-v,3v,-v,0)^T$, then $(-v,2v,0,-v,0)^T$, then $(-v,0,0,v,0)^T$ … No more bifs!
Are there other branches?
The Smoller-Wasserman Theorem shows that (under the same assumptions as before) if $M$ is composite, then there exist bifurcating solutions with isotropy group $\langle \gamma^p \rangle$ for every element $\gamma$ of order $M$ in $\Gamma$ and every prime $p \,|\, M$. Furthermore, $\dim \mathrm{Fix}\langle \gamma^p \rangle = p - 1$.
We have never numerically observed solutions fixed by $\langle \gamma^p \rangle$, and so perhaps they are unstable.
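Bifurcating branches from a 4-uniform solution are given by the following isotropy subgroup lattice for $S_4$
[Figure: isotropy subgroup lattice for $S_4$. The recoverable labels are listed below.]
• $S_4$
• $A_4$, with $\mathrm{Fix}(A_4) = \{0\}$
• $\langle (1234) \rangle$ and $\langle (1324) \rangle$, with $\mathrm{Fix}(\langle (1234) \rangle) = \{0\}$
• $\langle 12, 34 \rangle$, with direction $(v, v, -v, -v)^T$
• $\langle 13, 24 \rangle$, with direction $(v, -v, v, -v)^T$
• $\langle 14, 23 \rangle$, with direction $(v, -v, -v, v)^T$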
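Maximal isotropy subgroups for $S_4$
[Figure: lattice with $S_4$ at the top and, below it, four copies of $S_3$, $A_4$, $\langle 12, 34 \rangle$, $\langle 13, 24 \rangle$ and $\langle 14, 23 \rangle$.]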
Issues: $S_M$
• The full lattice of subgroups of the group $S_M$ is not known for arbitrary $M$.
Bifurcation Type?
Pitchfork? Conjecture: There are only pitchfork bifurcations.
(Show that for $\beta'(t)$, which depends on $\partial^3_{qqq} F(q,\beta)$, $\beta'(0) = 0$ for any $M$. We have this result for $M = 2, 3$.)
Subcritical or Supercritical?
If not pitchfork, $\beta'(0) > 0$ or $\beta'(0) < 0$ answers this question.
If pitchfork, one needs to examine $\beta''(0)$, which depends on $\partial^4_{qqqq} F(q,\beta)$.
Stability? (Is the bifurcating solution a maximum of the optimization problem?)
Answered Questions
• How to detect bifurcation?
  Look for singularity of $B$.
• What kinds of bifurcations do we expect? And how many bifurcating solutions are there?
  $M$ bifurcating (M-1)-uniform solutions.
• How to choose a direction to take after bifurcation is detected?
  $((M-1)v, -v, -v, -v, \dots, -v, 0)^T$
• Explain the nice bifurcation structure that is observed numerically.
  1. Why are there only N-1 bifurcations observed?
     There are only $N$ different types of M-uniform solutions for $M \le N$.
  2. Why are there no bifurcations observed after all N classes have resolved?
     For 1-uniform solutions, $\Delta_{q,\lambda} L(q^*,\lambda,\beta)$ is nonsingular.
The efficient algorithm
Let $q_0$ be the maximizer of $\max_q G(q)$, $\beta_0 = 1$ and $\Delta s > 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta_k D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some $K$.
1. Perform $\beta$-step: solve
\[ \Delta_{q,\lambda} L(q_k,\lambda_k,\beta_k) \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix} = -\partial_\beta \nabla_{q,\lambda} L(q_k,\lambda_k,\beta_k) \]
for $(\partial_\beta q_k, \partial_\beta \lambda_k)^T$ and select $\beta_{k+1} = \beta_k + d_k$ where $d_k = \Delta s / (\|\partial_\beta q_k\|^2 + \|\partial_\beta \lambda_k\|^2 + 1)^{1/2}$.
2. The initial guess for $q_{k+1}$ at $\beta_{k+1}$ is $q_{k+1}^{(0)} = q_k + d_k\, \partial_\beta q_k$.
3. Optimization: solve $\max_q G(q) + \beta_{k+1} D(q)$ to get the maximizer $q_{k+1}$, using initial guess $q_{k+1}^{(0)}$.
4. Check for bifurcation: compare the sign of the determinant of an identical block of each of $\Delta_q [G(q_k) + \beta_k D(q_k)]$ and $\Delta_q [G(q_{k+1}) + \beta_{k+1} D(q_{k+1})]$. If a bifurcation is detected, then set $q_{k+1}^{(0)} = q_k + d_k u$ where $u$ is in $\mathrm{Fix}(H)$, and repeat step 3.
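Steps 1, 2 and 4 reduce to a few lines of arithmetic once the tangent has been computed. The sketch below covers only those pieces; the function names and argument order are hypothetical, and the optimization in step 3 and the choice of $u$ after a detected bifurcation are left out.

```python
import numpy as np

def beta_step(dq, dlam, ds):
    """Step 1: d_k = ds / sqrt(||dq||^2 + ||dlam||^2 + 1)."""
    return ds / np.sqrt(dq @ dq + dlam @ dlam + 1.0)

def predictor(q_k, beta_k, dq, dlam, ds):
    """Steps 1-2: return the initial guess q_{k+1}^{(0)}, beta_{k+1} and d_k."""
    d_k = beta_step(dq, dlam, ds)
    return q_k + d_k * dq, beta_k + d_k, d_k

def bifurcation_detected(block_k, block_k1):
    """Step 4: compare the sign of det of the repeated Hessian block at k and k+1."""
    sign_k = np.sign(np.linalg.det(block_k))
    sign_k1 = np.sign(np.linalg.det(block_k1))
    return bool(sign_k != sign_k1)
```

The normalization in `beta_step` makes the step in $\beta$ short where the solution changes quickly with $\beta$, which is the situation near a bifurcation. A determinant sign test only registers an odd number of eigenvalues crossing zero between two consecutive steps.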