Computational and Physiological Models

Part 1
Ekaterina Lomakina
Computational Psychiatry Seminar: Computational Neuropharmacology
7 March, 2014
Entropy and the theory of life
“The general struggle for existence of animate beings is
not a struggle for raw materials – these, for organisms, are
air, water and soil, all abundantly available – nor for
energy which exists in plenty in any body in the form of
heat, but a struggle for [negative] entropy, which becomes
available through the transition of energy from the hot
sun to the cold earth.”
Ludwig Boltzmann, 1875
“[...] if I had been catering for them [physicists] alone I should
have let the discussion turn on free energy instead. It is the
more familiar notion in this context. But this highly technical
term seemed linguistically too near to energy for making the
average reader alive to the contrast between the two things.”
Erwin Schrödinger, 1944
Let’s go a bit more formally – 1
• The defining characteristic of biological systems is
that they maintain their states and form in the face
of a constantly changing environment.
Minimize entropy
• Mathematically, this means that the probability
distribution over the sensory states in which a
biological agent can be found must have low entropy.
Minimize surprise
• Entropy is also the average self-information, or
surprise. Low surprise means that the agent is likely to
be in one of very few states, so observing any of
these states causes little surprise (both
emotionally and mathematically).
• To stay ‘alive’, biological agents must therefore
minimize long-term average surprise to ensure
that their sensory entropy remains low.
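The original slide's formula did not survive extraction; as a reconstruction, the entropy of the sensory states is exactly the long-term average of surprise:

```latex
H(S) = -\int p(s \mid m)\,\ln p(s \mid m)\,ds
     = \mathbb{E}_{p(s \mid m)}\big[\,\underbrace{-\ln p(s \mid m)}_{\text{surprise}}\,\big]
```

so keeping average surprise low is the same thing as keeping sensory entropy low.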
Let’s go a bit more formally – 2
• So the system must avoid surprise. But how? It does
not know everything that is going to happen…
• The system can only evaluate what is available to it: its
sensory experience and its own properties, i.e.
energy efficiency and robustness.
• That is where free energy comes into play.
• Thermodynamic (Helmholtz) free energy is the
work obtainable from a closed system at a
constant temperature: FE = Energy − Temperature × Entropy.
• Statistical (variational) free energy is the expected
energy of a model minus its entropy; equivalently, the
negative of the model’s predictive accuracy plus its complexity.
• As we show now, free energy can be seen as an
upper bound on surprise which, unlike surprise itself,
the system can optimize.
Minimize entropy
Minimize surprise
Minimize free-energy
And a bit more mathematically – 1
Let’s look at the logarithmic version of Bayes’ formula, where s(t) are the sensory states and ϑ are
the representational parameters of the agent:
Surprise!
If we knew the true generative model of the sensorium, we would be able to
compute the model evidence and hence the average surprise.
It is also sometimes called the evidence for the agent’s self-existence.
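The formula itself is not preserved in the text; a reconstruction of the log form of Bayes' rule for sensory states s and parameters ϑ under a model m:

```latex
\ln p(\vartheta \mid s, m) = \ln p(s \mid \vartheta, m) + \ln p(\vartheta \mid m) - \ln p(s \mid m)
```

The last term, $-\ln p(s \mid m)$, is the surprise; its long-term average is the sensory entropy.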
And a bit more mathematically – 2
• However, the true model is almost always unavailable, or at least
computationally expensive to evaluate, which means that the brain is unlikely
to be able to perform such an operation.
• Instead, one can propose a simpler model q and optimize it to be as similar
to the true model as possible.
• D is the KL divergence between the proposed model q and the true model p,
which is always non-negative.
• The smaller D becomes (the closer q gets to p), the lower F becomes and
the closer F gets to the negative log model evidence. Thus F is an upper bound on
surprise.
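In symbols (a reconstruction consistent with the surrounding text), free energy equals surprise plus the divergence, so it bounds surprise from above:

```latex
F = -\ln p(s \mid m) + \underbrace{D_{\mathrm{KL}}\big[q(\vartheta)\,\|\,p(\vartheta \mid s, m)\big]}_{D \,\ge\, 0}
\quad\Rightarrow\quad F \ge -\ln p(s \mid m)
```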
And a bit more mathematically – 3
• However, we can’t optimize D directly, as that requires knowledge of the true
model p. But after some magic…
Expected energy (accuracy)
Entropy (complexity)
• This can now be optimized efficiently using variational Bayes
techniques.
• These (usually) provide fast and efficient update and decision
rules that are plausibly implementable in the brain, in contrast to
sampling or numerical integration.
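The "magic" is the standard variational rearrangement; a reconstruction matching the labels above:

```latex
F = \underbrace{\mathbb{E}_{q(\vartheta)}\big[-\ln p(s, \vartheta \mid m)\big]}_{\text{expected energy}}
  - \underbrace{H\big[q(\vartheta)\big]}_{\text{entropy}}
  = \underbrace{D_{\mathrm{KL}}\big[q(\vartheta)\,\|\,p(\vartheta \mid m)\big]}_{\text{complexity}}
  - \underbrace{\mathbb{E}_{q(\vartheta)}\big[\ln p(s \mid \vartheta, m)\big]}_{\text{accuracy}}
```

Both sides involve only q and the agent's own generative model, so F can be evaluated and minimized without access to the true posterior.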
And one last bit of formulas
Free energy can be minimized in two
ways:
• By changing the mental representation
(optimizing it so that it better
explains the data and becomes more
compact).
• By performing actions that
reduce potential surprise.
Bayesian brain hypothesis
Diagram: sensations x are used to update beliefs about their probabilistic causes, yielding the posterior belief P(ϑ|x).
• The Bayesian brain hypothesis uses Bayesian probability theory to
formulate perception as a constructive process based on internal or
generative models, presenting the brain as an inference machine that
actively predicts and explains its sensations.
• The probabilistic model generates predictions, against which sensory samples
are tested to update beliefs about their causes.
• The brain is an inference engine trying to optimize probabilistic
representations of what caused its sensory input.
• This optimization can be finessed using a (variational free-energy) bound
on surprise.
Complexity: p(ϑ) – prior beliefs. Accuracy: p(x|ϑ) – generative model (likelihood).
Bayesian brain hypothesis
Two key questions in this hypothesis are:
1. How to choose the form of generative model and the
choice of prior beliefs?
Answer: to use hierarchical models in which the priors
themselves are optimized.
2. How to choose the form of the proposed model or
distribution q?
Answer: it can take any form, driving the choice of
optimization procedure, but the simplest assumption is that it is
Gaussian (the Laplace approximation). Then minimizing
free energy simply explains away prediction error.
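A minimal sketch of what "explaining away prediction error" means under the Gaussian assumption; the generative model, function name, and numerical values below are illustrative, not taken from the slides.

```python
# Toy predictive coding under the Gaussian (Laplace) assumption.
# Generative model (assumed): x ~ N(mu, sigma_x), prior mu ~ N(mu_prior, sigma_mu).
# Gradient descent on the Gaussian free energy drives mu to the value that
# best balances the sensory and prior prediction errors, "explaining them away".

def perceive(x, mu_prior, sigma_x=1.0, sigma_mu=1.0, lr=0.1, steps=200):
    mu = mu_prior
    for _ in range(steps):
        eps_x = (x - mu) / sigma_x           # sensory prediction error
        eps_mu = (mu - mu_prior) / sigma_mu  # prior prediction error
        mu += lr * (eps_x - eps_mu)          # gradient descent on free energy
    return mu

# With equal precisions, the posterior mean lands halfway between prior and data.
mu = perceive(x=2.0, mu_prior=0.0)
```

At the fixed point the two precision-weighted errors cancel, which is exactly the "explaining away" the slide refers to.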
Bayesian brain hypothesis
• In the case of the Gaussian
assumption, the scheme is
known as predictive coding.
• It is a popular framework for
understanding neuronal
message passing among
different levels of cortical
hierarchies.
• This scheme has been used to
explain many features of
early visual responses and
can plausibly explain
repetition suppression and
mismatch responses in
electrophysiology.
The principle of efficient coding
• The principle of efficient coding suggests that the brain
optimizes the mutual information (that is, the mutual
predictability) between the sensory states and its
internal representation, under constraints on the
efficiency of those representations.
• The infomax principle says that neuronal activity
should encode sensory information in an efficient and
parsimonious fashion
Diagram: sensory states ↔ representing variables (efficient coding).
The principle of efficient coding
• The infomax principle might be presented as a special
case of the free-energy principle, which arises when we
ignore uncertainty in probabilistic representations.
• The infomax principle can be understood in terms of
the decomposition of free energy into complexity and
accuracy: mutual information is optimized when
conditional expectations maximize accuracy (or
minimize prediction error), and efficiency is assured by
minimizing complexity.
Complexity ↔ coding length; Accuracy ↔ mutual information.
The cell assembly theory
• ‘Cells that fire together wire together’.
• Conditional expectations about states of the
world are encoded by synaptic activity.
• Learning under the free-energy principle is the
optimization of the connection strengths
in hierarchical models of the sensory states.
• It appears that a gradient descent on free
energy is formally identical to Hebbian
plasticity.
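A toy illustration of that equivalence (all names and values are hypothetical): for a linear model x ≈ w·cause, gradient descent on the squared prediction error yields a rule in which the weight change is the product of presynaptic activity and postsynaptic prediction error, i.e. Hebbian in form.

```python
# Gradient descent on prediction error as a Hebbian-looking rule.

def hebbian_step(w, cause, x, lr=0.05):
    err = x - w * cause          # prediction-error unit
    return w + lr * err * cause  # weight change = pre * post activity

w = 0.0
for _ in range(500):
    w = hebbian_step(w, cause=1.0, x=2.0)
# w converges towards 2.0, where predictions fully suppress the prediction error
```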
The cell assembly theory
• When the predictions and
prediction errors are highly
correlated, the connection
strength increases, so that
predictions can suppress
prediction errors more
efficiently.
• The synaptic gain of prediction-
error units is modulated by
the precision of the corresponding
prediction errors.
• The most obvious
candidates for controlling
gain are classical
neuromodulators like
dopamine and acetylcholine.
Neural Darwinism
Epigenetic mechanisms give rise to a primary repertoire of
neuronal connections; experience-dependent plasticity then
selects a secondary repertoire of neuronal connections.
• This theory focuses on how the selection and
reinforcement of action policies is
performed within the boundaries of cell
assembly theory.
• Only neuronal assemblies that increase
evolutionary value are reinforced
through interaction with the
environment (natural selection).
• Plasticity is thus modulated by value.
• Neuronal value systems reinforce
connections to themselves, thereby
enabling the brain to label a sensory state
as valuable if, and only if, it leads to
another valuable state.
• This theory has deep connections with
reinforcement learning and related
approaches in engineering, such as
dynamic programming and temporal-
difference models.
Neural Darwinism
• Value is inversely proportional to surprise: the probability of an agent
being in a particular state increases with the value of that state.
• The evolutionary value of an agent is the negative surprise averaged over
all the states it experiences, which is simply its negative entropy.
• Prior expectations (that is, the primary repertoire) can prescribe a small
number of attractive states with innate value. They can also affect the way
the world is sampled, i.e. force the agent to explore until states with innate value
are found.
• Neural Darwinism exploits selective processes to explain brain
evolution.
• The free-energy formulation considers the optimization of ensemble or
population dynamics in terms of entropy and surprise.
Complexity: priors on a small number of innately ‘good’ states, plus priors on exploration. Accuracy: ‘survival’, or the rate of change of value.
Optimal control theory
• Optimal control theory describes how optimal actions should be selected to
minimize expected cost.
• Free energy is an upper bound on expected cost.
• According to the principle of optimality, cost is the rate of change of value, which
depends on changes in sensory states.
• An optimized policy ensures that the next state is the most valuable of the available
states.
• Priors specify a small number of fixed-point attractors; when the states arrive at
a fixed point, value stops changing and cost is minimized. Additional
priors on motion through state space enforce exploration until an attractive state is
found.
• Action under the free-energy principle is meant to suppress the sensory prediction
errors that depend on predicted (expected or desired) movement trajectories.
Complexity: priors on a small number of fixed-point attractors where the system should arrive, plus priors on motion. Accuracy: the rate of change of value.
Overview of free-energy principle
• Many global theories of brain function
can be united under a free energy
principle.
• The commonality is that the brain
optimizes a (free-energy) bound on
surprise or its complement, value.
• This manifests as perception (so as to
change predictions) or action (so as to
change the sensations that are
predicted).
• Crucially, these predictions depend on
prior expectations (that furnish
policies), which are optimized at
different (somatic and evolutionary)
timescales and define what is valuable.
Dopamine as a prediction-error encoder
• Measurements show that dopamine
reacts sensitively to the reward
itself, or to a conditioned
stimulus predicting reward, once
that association has been learnt.
• However, the dopamine level
decreases when a predicted
reward fails to occur.
• This gave rise to the hypothesis
that dopamine encodes
prediction error.
• Optimal control theory might
provide a framework to explain this.
Temporal difference algorithm
• The computational goal of
learning is to maximize the
expected discounted reward V
(a value function, proportional to
negative surprise).
• It can be computed dynamically
from the rewards at
previous time points.
• The prediction error δ (TD error)
can be defined as a linear
combination of the reward and the
change in value.
• M1 and M2 are two cortical
modalities whose input (the
derivative of the value V(t)) arrives
at the VTA. The reward r(t) also
converges on the VTA.
• The VTA output, as a prediction
error, is a simple linear
sum informing the structures
that construct the prediction.
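The TD error just described can be sketched as follows; the discount factor and values are illustrative assumptions, not from the slides.

```python
def td_error(r, v_now, v_next, gamma=1.0):
    """TD error: reward plus the (discounted) change in value."""
    return r + gamma * v_next - v_now

# A fully predicted reward yields no error; an omitted predicted reward
# yields a negative error, mirroring the dip in dopamine firing.
fully_predicted = td_error(r=1.0, v_now=1.0, v_next=0.0)
omitted_reward = td_error(r=0.0, v_now=1.0, v_next=0.0)
```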
Representing a stimulus through time
• The ability to predict in
time is encoded within the
model by treating events
happening at
different time points
as different stimuli.
• The predicted value is
then modeled as a
weighted linear
combination of the
sensory input.
• The weights are updated
using information
about the prediction
error.
Simulations
• The conditioned stimuli were presented
at time steps 10 and 20,
followed by a reward at time step 60.
(Figure: the prediction of the model.)
• The absence of the reward on one of the
intermediate trials causes a large
negative fluctuation of the prediction
error.
• Overall, however, the model learns
the dependency between the
conditioned stimulus and the reward well, while
also blocking the secondary stimulus,
which is redundant in terms of
information.
• The behavior of the prediction error
accurately mimics the measured
dopamine response in monkeys in
similar situations.
Diagram: input → perception (probability, volatility) → learning.
Hierarchical Gaussian filtering
• One particular model within the Bayesian
brain hypothesis is Hierarchical Gaussian
filtering.
• It consists of a hierarchical Bayesian
model of learning through perception.
• The response model can deal with states
and inputs that are discrete or
continuous, uni- or multivariate, and
with both deterministic and
probabilistic relationships between
environment and perception.
• Parameters of the model can account for
individual differences between agents and
can be used to simulate and explain
maladaptive behavior.
Hierarchical Gaussian filtering
• This is the simplest example
(with a univariate, binary,
deterministic response
model).
• Each layer of the model
performs a Gaussian random
walk, with parameters governing
the coupling between layers and the
width of the Gaussian walks.
• As many layers as needed can
be added on top.
• Priors on the parameters χ =
{κ, ω, ϑ} allow full
Bayesian inference.
• Inverting this model
corresponds to optimizing the
posterior densities over the
unknown (hidden) states x =
{x1, x2, x3} and the parameters χ.
This corresponds to perceptual
inference and learning,
respectively.
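The generative side of this hierarchy can be sketched as follows. Parameter names follow the slide (κ, ω, ϑ), but the exact parameterization is an assumption modeled on the published HGF, and the seed and trial count are arbitrary.

```python
import math
import random

def simulate_hgf(n=100, kappa=1.4, omega=-2.2, theta=0.5, seed=0):
    """Generate binary inputs from a 3-level hierarchy of Gaussian random walks."""
    rng = random.Random(seed)
    x3, x2, u = 0.0, 0.0, []
    for _ in range(n):
        # Top level: volatility performs a Gaussian random walk of width theta.
        x3 += rng.gauss(0.0, math.sqrt(theta))
        # Second level: its step size is coupled to x3 via kappa and omega.
        x2 += rng.gauss(0.0, math.sqrt(math.exp(kappa * x3 + omega)))
        # First level: x2 sets the outcome probability through a sigmoid.
        p = 1.0 / (1.0 + math.exp(-x2))
        u.append(1 if rng.random() < p else 0)
    return u

u = simulate_hgf()  # a sequence of binary observations
```

Inverting the model then means recovering the hidden trajectories x2 and x3 from the observed sequence u.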
Inversion under free energy principle
• Exact inversion of such a
model would involve
expensive numerical
optimization.
• Instead, a variational
Bayesian approach was used
(which involves the
minimization of free
energy).
• The key assumption made
is a factorization
of the recognition
distribution q (the mean-field
approximation).
Update equations
The resulting update equations are not only efficient and easy to compute but also
resemble results from the field of reinforcement learning.
Rescorla-Wagner model: prediction(k) - prediction(k-1) = learning rate x prediction error
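The quoted Rescorla-Wagner rule as runnable code; the learning rate and reward values are illustrative.

```python
def rescorla_wagner(prediction, reward, learning_rate=0.2):
    # prediction(k) = prediction(k-1) + learning rate * prediction error
    prediction_error = reward - prediction
    return prediction + learning_rate * prediction_error

v = 0.0
for _ in range(50):
    v = rescorla_wagner(v, reward=1.0)
# v approaches the true reward of 1.0
```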
Precision updates
• The variances σ1, σ2 and σ3 are also updated at every step,
in the form of precisions (inverse variances).
• The precision updates account for two types of
uncertainty: ‘informational’ (the lack of knowledge
about x2) and ‘environmental’ (the volatility at the
third level).
• It has been proposed that dopamine might encode the
precision-weighting of prediction errors. In this model,
that weighting is governed by the parameters κ
and ω.
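The generic shape of these updates (a reconstruction following the published HGF literature, Mathys et al., 2011; the slides' exact notation is not preserved) is a prediction error from the level below, weighted by a ratio of precisions:

```latex
\Delta\mu_i \;\propto\; \frac{\hat{\pi}_{i-1}}{\pi_i}\,\delta_{i-1}
```

so the effective learning rate is high when incoming information is precise relative to current beliefs, and low otherwise.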
Meaning and the effect of the parameters
Reference scenario: ϑ = 0.5, ω = −2.2, κ = 1.4
Reduced ϑ = 0.05 (unchanged ω = −2.2, κ = 1.4)
Reduced ω = −4 (unchanged ϑ = 0.5, κ = 1.4)
Reduced κ = 0.2 (unchanged ϑ = 0.5, ω = −2.2)
Potential for real data
Conventional observed behavioural
data fail to detect differences
between subjects; however, the
computational model reveals
striking differences in the
underlying mechanisms.
(Figures: per-subject running averages of the fraction of correct responses, trial-wise reward and reaction times over ~160 trials; and estimated probability and volatility trajectories for a healthy participant versus a participant with prodromal schizophrenia.)
Outcome
• Using the free-energy principle we derive fast and
efficient update equations which could be
implemented by the brain.
• The resulting update equations show a clear
connection to the field of reinforcement learning.
• The parameterization of these equations accounts for
individual differences and can model a whole
variety of maladaptive behavior.
• There is evidence that some of the parameters
may correspond to neuromodulators, in
particular dopamine.
Conclusions
• Computational models inspired by expert knowledge about the
brain can provide us with powerful insights into how the brain works in
healthy and maladaptive ways.
• The free-energy principle provides a powerful framework which
generalizes many of the existing theories of brain
function.
• However, “all models are wrong, but some are useful”. To test the
correctness of a model we have to see whether it predicts real data –
more next week.
• Neuromodulators can be related to certain parameters of
the models; e.g. dopamine seems to play a key role in
reward-driven learning. Studies with pharmacological
manipulations, together with computational models, can provide
deeper mechanistic insight into its particular role.