Neural coding and information geometry Hiroyuki Nakahara RIKEN Brain Science Institute

advertisement
Neural coding and
information geometry
Hiroyuki Nakahara
Lab for Integrated Theoretical Neuroscience
RIKEN Brain Science Institute
Outline
1. Overview
2. Hierarchical model of neural
interactions
Section I
Overview
Introduction
Intelligent behavior (e.g. higher, cognitive, adaptive)
Achieved by collective neural activities
Decision making and reward-oriented behavior
— Neural computations
-- “Computations by collective neural activities”
Introduction
Neural computation – computational neuroscience:
At heart, interactions of neural activities
1015 bits/sec (#synapse x 1 bit/sec)
Terry Sejnowski
Most experimental findings so far
-- mostly by approach of “isolation”
(e.g., lesion, fMRI and single-unit studies)
(Bartels, Zeki 04)
How to go beyond the ‘isolation’ approach?
Introduction
“modeling” – rather abstract form
e.g. Amari-Hopfield model, Boltzmann machine,
gradient disecent (natural ….), em-algorithm, graphical model, etc
“modeling” – neural coding / population coding
Tuning curve (spike counts), their variability
ri = φi ( x; ci ) + ni ( x)
ni ( x)
N (0,σ i2 ( x))
Q: About input x, given the activities r ?
e.g. Fisher information etc
Our past works
* Faithful and unfaithful models -- Wu et al. (’01, ’02a,b, ’04)
* Attention modulation -- Nakahara et al (’01, ’02)
* Singularity in decoding – Amari & Nakahara (’05)
* Neural dynamics in SC – Nakahara et al (’06)
Introduction
“data analysis”
“method of data analysis”
Spike coding (Neural spike as binary variable)
ηi = E[ xi ]
Most research with data so far: 1st-order or 2nd-order model
log Q pair ( X ) = ∑ θ i xi + ∑ θij xi x j −ψ
i
i, j
How can we reliably detect usefulness and meanings of
higher-than-pairwise order interactions of neural activities?
Information geometry on binary random vectors
Information Geometric Approach
Basic needs : Systematic treatment for higher-order interaction
Use good properties of the dual coordinates
log P ( X ) = ∑ θi xi + ∑ θij xi x j + ∑ θijk xi x j xk + K + θ12Kn x1 x2 K xn −ψ
i
ij
ijk
η -coordinates: η = (ηi , ηij , ...) = (η1 ,η2 ,...,ηN ),
ηA = E[ X A ]
θ -coordinates: θ = (θ i , θ ij ,...) = (θ1 ,θ 2 ,...,θ N )
k-cut mixed coord: ς k = (η1 , η2 ,...,ηk ;θ k +1 ,θ k + 2 ,...,θ N ) = (ηk- ;θ k + )
⎡ Aζ k
FI is given as Gζ k = ⎢
⎢⎣ O
O ⎤
⎡ Aη
⎥ , where Gη = ⎢ T
Dζ k ⎥⎦
⎣ Bη
Bη ⎤
,
Dη ⎥⎦
⎡ Aθ
Gθ = ⎢ T
⎣ Bθ
Aζ k = Aθ−1 ,
Pythagorian: D[ P : Q] = D[ P : R] + D[ R : Q] where ς kR = ( ηkP− ; ς Qk + )
Past works
Bθ ⎤
Dθ ⎥⎦
Dζ k = Dη−1
* IG framework for spike analysis -- Nakahara & Amari (‘02)
* Detection of interaction cascade -- Nakahara et al. (‘03)
* Emergent higher-order synchrony -- Amari et al. (‘03)
* Temporal domain — Nakahara et al (’06)
* Others etc….
Section II
“Theory in practice?”
Hierarchical Interaction Structure of Neural
Activities in Cortical Slice Cultures
(Santos, Gireesh, Plenz & Nakahara, J. Neurosci, 2010)
Introduction
• Previous studies suggested that 2nd-order maximum
entropy model (“pairwise model”) adequately explains
activity patterns.
log Q pair ( X ) = ∑ θ i xi + ∑ θij xi x j −ψ
i
i, j
• If generally true, a significant simplication for describing
neural interactions
Schneidman et al., 2006
Other works
(Shlens et al 06, 09, Tang et al 08 etc)
• Certainly, “appropriately simple” models are crucial for
improving our understanding of neural interactions and
thereby functions.
• But is only up to 2nd-order really sufficient for any
underlying network systems?
cf. for computations as well as scalability
• There may be other simplified models that correspond
better to an underlying network interaction structure,
thereby being adequate for large-scale cortical activity
---- what are appropriate units for interactions?
Our dataset (Plenz Lab, NIMH)
(hn note)
“neural avalanches”
power law
• Cultures of coronal slices from rat somatosensory cortex and the VTA
• LFP signal (1-200 Hz), thresholded to obtain negative LFP (nLFP) peaks
• nLFP peaks binned at 4, 10, or 20 ms
Pairwise model results
Pairwise model on a group of
10 electrodes (randomly-sampled)
Good results in 12 datasets
(200 groups / dataset)
The score called Fpair is frequently used :
Fpair = 1 -
(
( Pˆ ( X ) Q
)
( X))
D KL Pˆ ( X ) Q pair ( X )
D KL
ind
Hierarchical Model: Proposal
Our proposal - Define a new functional unit for large-scale activity
- Create a hierarchical model for different spatial scales
Intuition: low resolution of
correlation structure
COR of every electrode
pair may not be needed
Electrodes
as units
Clusters
as units
• How to define cluster activities?
• How to find clusters?
Cluster activity
= magnitude of electrode activity in cluster
Measures of cluster activity:
Linear
Examples
n
lin
I
= ∑ xi
log
I
= ⎢⎣log 2 (1 + cIlin ) ⎥⎦
c
i =1
Log
c
Binary
cIbin = ( cIlin > 0 )
Results shown later, mostly for log cluster activity
Clustering criterion: homogeneity of activity
A backward searching
procedure was used.
(note: caveat)
Hierarchical Model: Formulation
Hierarchical model integrates interactions over different scales
m
c
P ( X ) ≈ Qhier ( X ) = Q pair
(C) ∏
I =1
e
Q pair
( XI )
e
Q pair
( CI )
C = ( c1 , c2 ,K , cm )
• Pairwise models for both local (unit = electrode) & long-range (unit =
cluster) interactions, with conditional independence assumption on
electrode activity across clusters given cluster activity.
Results: clusters
Clusters - example array:
Homogeneous clusters characterized by
strong correlation and electrode proximity
within cluster
across clusters
• Physical distance on array
• Correlation coefficients
Results: 10-electrode
– comparison with pairwise model
One example of a 10 electrode group
Cluster information used
for 10-electrode groups
Example results for 1 group
(Fpair = 0.89, Fhier = 0.93)
Comparison with pairwise model (10-electrode groups)
Hierarchical model has better accuracy than
pairwise model, even with fewer parameters
Summary
(12 datasets, 200 groups / dataset)
Accuracy and # of param
(over 12 datasets)
Note: also confirmed with the results using cross-validation
also confirmed w.r.t. ‘shuffled’ clusters
Results: accuracy on full array (~ 60 electrodes)
Hierarchical model with log clusters
predicts array-wide synchronized states
Summary
Example
(one dataset; for a ‘fraction’)
P(f)
(with log – colored, binary -- gray)
Until here is in Santos et al., J. Neurosci, 2010
Ongoing results:
Getting back to interactions at electrode level
m
c
P ( X ) ≈ Qhier ( X ) = Q pair
(C) ∏
I =1
e
Q pair
( XI )
e
Q pair
( CI )
C = ( c1 , c2 ,K , cm )
Example
Original param.
e
Q pair
(xα ) :{θiα , θijα }
c
Q pair
(c) :{ζ α |a , ζ α ( β )α ′( β )|ab }
θ − coordinates (electrode level)
Q (xα ) :{θ% , θ% , θ% ,..., θ%
}
hier
i
ij
ijk
123...n
(Note: possible to write analytically)
Discussion
• The hierarchical model captures two levels of interactions
using two units: electrodes and clusters.
• The model captures cortical LFP activity patterns better than
the pairwise model. Æ clusters -- underlying cortical layer structure?
Æ Identifying the appropriate units of interaction of a network
may enable the network interactions to be better characterized,
with better pasimony and scalability
• Significant higher-order interactions are embedded in a
specific way. A new hierarchical model further helps us examine
those properties. cf. across/within cluster interactions
Follow-up (Q&A session)
• I make a note here regarding “context specific graphical
model” and/or “a weak definition of conditional independence”,
which I mentioned in Q & A session. Here is the paper I was
referring to; Nakahara et al. (2003) Bioinformatics. (you can also
download it from www.itn.brain.riken.jp). Any comments and
feedbacks will be greatly appreciated.
• Taking advantage of adding this slide after my presentation, I
would like to express my sincere gratitude to all the particiants
in the workshop and the organizers who have all of us get
together.
END
Download