International Research Training Group (IRTG) “The Brain in Action” I am therefore I think Karl Friston, University College London Abstract: TThis overview of the free energy principle offers an account of embodied exchange with the world that associates conscious operations with actively inferring the causes of our sensations. Its agenda is to link formal (mathematical) descriptions of dynamical systems to a description of perception in terms of beliefs and goals. The argument has two parts: the first calls on the lawful dynamics of any (weakly mixing) ergodic system– from a single cell organism to a human brain. These lawful dynamics suggest that (internal) states can be interpreted as modelling or predicting the (external) causes of sensory fluctuations. In other words, if a system exists, its internal states must encode probabilistic beliefs about external states. Heuristically, this means that if I exist (am) then I must have beliefs (think). The second part of the argument is that the only tenable beliefs I can entertain about myself are that I exist. This may seem rather obvious; however, if we associate existing with ergodicity, then (ergodic) systems that exist by predicting external states can only possess prior beliefs that their environment is predictable. It transpires that this is equivalent to believing that the world – and the way it is sampled – will resolve uncertainty about the causes of sensations. We will conclude by looking at the epistemic behavior that emerges under these beliefs, using simulations of active inference. Key words: active inference ∙ autopoiesis ∙ cognitive ∙ dynamics ∙ free energy ∙ epistemic value ∙ self-organization . . What does it mean to be embodied? The statistics of life Markov blankets and ergodic systems simulations of a primordial soup The anatomy of inference graphical models and predictive coding canonical microcircuits Action and perception action and its observation simulations of saccadic searches “How can the events in space and time which take place within the spatial boundary of a living organism be accounted for by physics and chemistry?” (Erwin Schrödinger 1943) The Markov blanket as a statistical boundary (parents, children and parents of children) Internal states External states Sensory states Active states The Markov blanket in biotic systems Sensory states s f s ( , s, a) s f ( , s, a) External states f ( s , a, ) a f a (s, a, ) Active states Internal states lemma: any (ergodic random) dynamical system (m) that possesses a Markov blanket will appear to engage in active inference x f ( x) p ( x | m) The Fokker-Planck equation p( x | m) ( f ) p And its solution in terms of curl-free and divergence-free components p( x | m) 0 f ( x) ( Q) ln p( x | m) But what about the Markov blanket? s ( s ,a , ) ln p ( s | m) f ( s ) ( Q) ln p( s | m) Perception f a ( s ) ( Q)a ln p( s | m) Action Value Reinforcement learning, optimal control and expected utility theory F ln p ( s | m) Surprise Infomax, minimum redundancy and the free-energy principle Et [ ln p ( s | m)] Entropy Self-organisation, synergetics and homoeostasis p ( s | m) Pavlov Barlow Haken Model evidence Bayesian brain, evidence accumulation and predictive coding Helmholtz Overview The statistics of life Markov blankets and ergodic systems simulations of a primordial soup The anatomy of inference graphical models and predictive coding canonical microcircuits Action and perception action and its observation simulations of saccadic searches Position Simulations of a (prebiotic) primordial soup Short-range forces Strong repulsion Weak electrochemical attraction Finding the (principal) Markov blanket Markov blanket matrix encodes the children, parents and parents of children B A AT AT A Markov Blanket = [B · [eig(B) > τ]] Adjacency matrix Markov Blanket 20 Hidden states 40 60 80 Sensory states 100 Active states Internal states 120 20 40 60 80 Element 100 120 Does action maintain the structural and functional integrity of the Markov blanket (autopoiesis) ? Do internal states appear to infer the hidden causes of sensory states (active inference) ? Active lesion Markov blanket 8 8 6 6 4 4 2 2 0 0 -2 -2 -4 -4 -6 -6 -8 -8 -6 -4 -2 0 2 4 6 8 -8 -8 -6 -4 -2 Position 0 2 4 6 8 Position Autopoiesis, oscillator death and simulated brain lesions Sensory lesion Internal lesion 8 8 6 6 4 4 2 2 0 0 -2 -2 -4 -4 -6 -6 -8 -8 -6 -4 -2 0 Position 2 4 6 8 -8 -8 -6 -4 -2 0 Position 2 4 6 8 Decoding through the Markov blanket and simulated brain activation True and predicted motion 8 Motion of external state 0 Predictability 6 Christiaan Huygens -0.1 -0.2 -0.3 -0.4 100 2 200 0 -2 -6 -8 300 Time 400 500 Internal states -4 5 -5 0 Position 5 10 Modes Position 4 15 20 25 30 100 200 300 Time 400 500 The existence of a Markov blanket necessarily implies a partition of states into internal states, their Markov blanket (sensory and active states) and external or hidden states. Because active states change – but are not changed by – external states they minimize the entropy of internal states and their Markov blanket. This means action will appear to maintain the structural and functional integrity of the Markov blanket (autopoiesis). Internal states appear to infer the hidden causes of sensory states (by maximizing Bayesian evidence) and influence those causes though action (active inference) Overview The statistics of life Markov blankets and ergodic systems simulations of a primordial soup The anatomy of inference graphical models and predictive coding canonical microcircuits Action and perception action and it observation simulations of saccadic searches “Objects are always imagined as being present in the field of vision as would have to be there in order to produce the same impression on the nervous mechanism” - von Helmholtz Hermann von Helmholtz Richard Gregory Geoffrey Hinton The Helmholtz machine and the Bayesian brain Thomas Bayes Richard Feynman “Objects are always imagined as being present in the field of vision as would have to be there in order to produce the same impression on the nervous mechanism” - von Helmholtz Hermann von Helmholtz Richard Gregory Impressions on the Markov blanket… sS Bayesian filtering and predictive coding f ( s ) (Q )F ( s , ) D prediction update s g ( ) prediction error Making our own sensations sensations – predictions Prediction error Action Perception Changing sensations Changing predictions Hierarchical generative models what A simple hierarchy (3) v(3) v Ascending prediction errors the (2) v (2) x(2) x x(2) (2) v(2) v v(1) (1) x(1) x x(1) (1) v(1) v v(0) where Descending predictions D Sensory fluctuations David Mumford Predictive coding with reflexes Action a s (1) oculomotor signals reflex arc proprioceptive input pons Perception retinal input frontal eye fields Prediction error (superficial pyramidal cells) geniculate (i ) Top-down or backward predictions Bottom-up or forward prediction error (i ) (i 1) g (i ) ( (i ) ) Expectations (deep pyramidal cells) visual cortex (i ) (i ) D (i ) (i ) (i ) (i ) Biological agents minimize their average surprise (entropy) They minimize surprise by suppressing prediction error Prediction error can be reduced by changing predictions (perception) Prediction error can be reduced by changing sensations (action) Perception entails recurrent message passing to optimize predictions Action makes predictions come true (and minimizes surprise) Overview The statistics of life Markov blankets and ergodic systems simulations of a primordial soup The anatomy of inference graphical models and predictive coding canonical microcircuits Action and perception action and its observation simulations of saccadic searches Action with point attractors x(1) v(2) v(1) visual input V s J (0, 0) Descending proprioceptive predictions x1 J1 v(1) (1) v ( s ( a ) g ( )) proprioceptive input (1) v x1 s x2 a a s v(1) x2 J2 V (v1 , v2 , v3 ) Action with itinerant attractors Heteroclinic cycle (central pattern generator) x(1) x(2) Descending proprioceptive predictions action observation 0.4 position (y) 0.6 0.8 1 1.2 1.4 0 a a s v(1) 0.2 0.4 0.6 0.8 position (x) 1 1.2 1.4 0 0.2 0.4 0.6 0.8 position (x) 1 1.2 1.4 Overview The statistics of life Markov blankets and ergodic systems simulations of a primordial soup The anatomy of inference graphical models and predictive coding canonical microcircuits Action and perception action and its observation simulations of saccadic searches F ( s, ) Eq [ ln p( s | , m) ln p( | u ) ln p(u | m) ln q( , u)] Likelihood Free energy Empirical priors Entropy Prior beliefs Energy “I am [ergodic] therefore I think [I will minimise free energy]” ln p u | m Eq ( s , |u ) [ln p( s | ) ln p( | u ) ln q( | u )] Expected energy Expected entropy Eq ( s |u ) [ln p ( s | m)] Eq ( s |u ) [ D[q ( | s , u ) || q( | u )]] Extrinsic value Epistemic value or information gain Bayesian surprise and Infomax KL or risk-sensitive control Expected utility theory In the absence of prior beliefs about outcomes: In the absence of ambiguity: In the absence of uncertainty or risk: Eq [ln p( | u ) ln q( | u )] Eq [ln P( s | m)] Eq [ D[q( | s , u ) || q( | u )]] Bayesian surprise D[q( , s | u ) || q( | u )Q( s | u )] Predicted mutual information D[q( | u ) || p( | u )] Predicted divergence Extrinsic value F ( s, ) Eq [ ln p( s | , m) ln p( | u ) ln p(u | m) ln q( , u)] Free energy Likelihood Empirical priors Prior beliefs Entropy Energy “I am [ergodic] therefore I think [I will minimise free energy]” ln p u | m Eq ( s |u ) [ D[q( | s , u ) || q( | u )]] (u) Epistemic value or information gain (t ) s (t ) S 𝜎(𝑢 Extrinsic value stimulus visual input salience Sampling the world to resolve uncertainty sampling x, p Parietal (where) Frontal eye fields v xp x v Visual cortex v ,q sq x Pulvinar salience map x ,q Fusiform (what) (u ) xp a oculomotor reflex arc v, p sp Superior colliculus Saccadic eye movements Saccadic fixation and salience maps Action (EOG) 2 Hidden (oculomotor) states 0 -2 200 400 600 800 time (ms) 1000 1200 1400 Visual samples Posterior belief 5 Conditional expectations about hidden (visual) states vs. 0 -5 200 And corresponding percept 400 600 800 time (ms) 1000 1200 1400 vs. Hermann von Helmholtz “Each movement we make by which we alter the appearance of objects should be thought of as an experiment designed to test whether we have understood correctly the invariant relations of the phenomena before us, that is, their existence in definite spatial relations.” ‘The Facts of Perception’ (1878) in The Selected Writings of Hermann von Helmholtz, Ed. R. Karl, Middletown: Wesleyan University Press, 1971 p. 384 Thank you And thanks to collaborators: And colleagues: Rick Adams Ryszard Auksztulewicz Andre Bastos Sven Bestmann Harriet Brown Jean Daunizeau Mark Edwards Chris Frith Thomas FitzGerald Xiaosi Gu Stefan Kiebel James Kilner Christoph Mathys Jérémie Mattout Rosalyn Moran Dimitri Ognibene Sasha Ondobaka Will Penny Giovanni Pezzulo Lisa Quattrocki Knight Francesco Rigoli Klaas Stephan Philipp Schwartenbeck Micah Allen Felix Blankenburg Andy Clark Peter Dayan Ray Dolan Allan Hobson Paul Fletcher Pascal Fries Geoffrey Hinton James Hopkins Jakob Hohwy Mateus Joffily Henry Kennedy Simon McGregor Read Montague Tobias Nolte Anil Seth Mark Solms Paul Verschure And many others Time-scale Free-energy minimisation leading to… 10 3 s Perception and Action: The optimisation of neuronal and neuromuscular activity to suppress prediction errors (or freeenergy) based on generative models of sensory data. 100 s 103 s 106 s 1015 s Learning and attention: The optimisation of synaptic gain and efficacy over seconds to hours, to encode the precisions of prediction errors and causal structure in the sensorium. This entails suppression of free-energy over time. Neurodevelopment: Model optimisation through activitydependent pruning and maintenance of neuronal connections that are specified epigenetically Evolution: Optimisation of the average free-energy (free-fitness) over time and individuals of a given class (e.g., conspecifics) by selective pressure on the epigenetic specification of their generative models. Searching to test hypotheses – life as an efficient experiment H ( S , ) H ( S | m) H ( | S ) Et [ ln p( s (t ) | m)] Et [ H ( | S s (t ))] Free energy principle minimise uncertainty (t ) arg min{H [ q( | , )]}