SEMINARS ON META-COGNITION, 2012–2013
November 29th, 4:30 – 6:00pm, Old Library
Karl Friston
Meta-cognition, prediction, precision
(Discussant, Andreas Roepstorff, Aarhus)

Abstract
Predictive coding models and the free-energy principle suggest that cortical activity in sensory brain areas reflects the precision of prediction errors and not just the sensory evidence or prediction errors per se. If we assume that neuronal activity encodes a probabilistic representation of the world that optimizes free-energy in a Bayesian fashion, then, because free-energy bounds surprise or the (negative) log-evidence for internal models of the world, this optimization can be regarded as evidence accumulation or (generalized) predictive coding. Crucially, both predictions about the state of the world generating sensory data and the precision of those data have to be optimized. In other words, we have to make predictions (test hypotheses) about the content of the sensorium and predict our confidence in those hypotheses. I hope to demonstrate the meta-representational aspect of inference using simulations of visual searches and action selection, to illustrate its nature and promote discussion about its role in high-order cognition.

The basic idea: active inference and free energy
Beliefs about beliefs: beliefs about uncertainty
Beliefs about beliefs: beliefs about precision and agency

“Objects are always imagined as being present in the field of vision as would have to be there in order to produce the same impression on the nervous mechanism” - von Helmholtz

From the Helmholtz machine to the Bayesian brain and self-organization
[Portraits: Hermann von Helmholtz, Richard Gregory, Geoffrey Hinton, Thomas Bayes, Richard Feynman, Hermann Haken]

[Figure axis: temperature]
What is the difference between a snowflake and a bird?
[Figure label: Phase-boundary]
…a bird can act (to avoid surprises)

The basic ingredients
External states (hidden states in the world): $\dot{\psi} = f(\psi, a) + \omega_x$
Sensations: $s = g(\psi, a) + \omega_s$
Fluctuations: $\omega_x$, $\omega_s$
Internal states of the agent (posterior expectations): $\mu = \arg\min_{\mu} F(s, \mu)$
Action: $a = \arg\min_{a} F(s, \mu)$

What we need to explain: how do we minimise the dispersion of sensory states (homoeostasis)?
$\int -\ln p(s(t) \mid m)\, dt \propto H[p(s \mid m)]$

The principle of least free energy (minimising surprise)
Bayesian inference: $F(s, \mu, m) = -\ln p(s \mid m) + D_{KL}[q(\psi \mid \mu) \,\|\, p(\psi \mid s, m)]$
Maximum entropy principle: $F(s, \mu, m) = E_q[-\ln p(\psi, s \mid m)] - H[q(\psi \mid \mu)]$
Ergodic theorem: $\int dt\, F(t) \ge \int dt\, -\ln p(s(t) \mid m) \propto H[p(s \mid m)]$
Related principles: the principle of least action; self organisation

How can we minimize surprise (prediction error)?
Prediction error = sensations – predictions
Action: change sensations
Perception: change predictions
…action and perception minimise free energy

Action as inference – the “Bayesian thermostat”
[Figure: prior distribution $p(\psi)$, likelihood distribution $p(s \mid \psi)$ and posterior distribution $p(\psi \mid s)$ over temperature (axis from 20 to 120), with time courses including $a(t)$]
Perception: $\mu = \arg\min_{\mu} F(s, \mu, \eta) = \arg\min_{\mu} \left[ \Pi_s (s(a) - g(\mu))^2 + \Pi_\mu (\mu - \eta)^2 \right]$
Action: $a = \arg\min_{a} F(s, \mu, \eta) = \arg\min_{a} \left[ \Pi_s (s(a) - g(\mu))^2 + \Pi_\mu (\mu - \eta)^2 \right]$
The sensation $s(a)$ is compared with the prediction $g(\mu)$.

How might the brain minimise free energy (prediction error)?
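The thermostat slide can be illustrated with a small numerical sketch. The code below is not the simulation shown in the talk; it is a minimal toy under stated assumptions: the prediction mapping is the identity ($g(\mu) = \mu$), sensations are noise-free, action is a heating rate with $\partial s / \partial a = 1$, the prior expectation `eta` is 37, both precisions are 1, and the free energy carries a conventional factor of one half. All names (`run_thermostat`, `eta`, `pi_s`, `pi_mu`) are hypothetical.

```python
import numpy as np


def free_energy(s, mu, eta, pi_s, pi_mu):
    """Precision-weighted squared prediction errors, assuming g(mu) = mu."""
    return 0.5 * pi_s * (s - mu) ** 2 + 0.5 * pi_mu * (mu - eta) ** 2


def run_thermostat(eta=37.0, temp0=20.0, pi_s=1.0, pi_mu=1.0, dt=0.05, steps=2000):
    temp = temp0  # hidden state of the world (true temperature)
    mu = temp0    # posterior expectation of the temperature
    trace = []
    for _ in range(steps):
        s = temp  # noise-free sensation (assumption)
        # Perception: gradient descent on F with respect to mu.
        dF_dmu = -pi_s * (s - mu) + pi_mu * (mu - eta)
        mu -= dt * dF_dmu
        # Action: gradient descent on F with respect to a, assuming ds/da = 1,
        # so action changes the sensation until it matches the prediction.
        a = -pi_s * (s - mu)
        temp += dt * a
        trace.append(free_energy(temp, mu, eta, pi_s, pi_mu))
    return temp, mu, np.array(trace)


temp, mu, trace = run_thermostat()
print(f"temperature={temp:.3f}, expectation={mu:.3f}, final F={trace[-1]:.2e}")
```

In this toy, perception pulls the expectation towards a compromise between the sensation and the prior, while action pulls the sensation towards the expectation, so both settle at the prior expectation. This is one way to read "action makes predictions come true".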
…by using predictive coding (and reflexes)

Free energy minimisation
$s = g(x, v, a) + \omega_v$
$\dot{x} = f(x, v, a) + \omega_x$
$\dot{a} = -\partial_a F(s, \mu)$
$\dot{\mu} = D\mu - \partial_\mu F(s, \mu)$

Generative model
$s = g^{(1)}(x^{(1)}, v^{(1)}) + \omega_v^{(1)}$
$\dot{x}^{(1)} = f^{(1)}(x^{(1)}, v^{(1)}) + \omega_x^{(1)}$
$v^{(i-1)} = g^{(i)}(x^{(i)}, v^{(i)}) + \omega_v^{(i)}$
$\dot{x}^{(i)} = f^{(i)}(x^{(i)}, v^{(i)}) + \omega_x^{(i)}$

Predictive coding with reflexes
$\xi_v^{(i)} = \Pi_v^{(i)} \varepsilon_v^{(i)} = \Pi_v^{(i)} (\mu_v^{(i-1)} - g^{(i)}(\mu_x^{(i)}, \mu_v^{(i)}))$
$\xi_x^{(i)} = \Pi_x^{(i)} \varepsilon_x^{(i)} = \Pi_x^{(i)} (D\mu_x^{(i)} - f^{(i)}(\mu_x^{(i)}, \mu_v^{(i)}))$
$\dot{\mu}_v^{(i)} = D\mu_v^{(i)} - \partial_v \varepsilon^{(i)} \cdot \xi^{(i)} - \xi_v^{(i+1)}$
$\dot{\mu}_x^{(i)} = D\mu_x^{(i)} - \partial_x \varepsilon^{(i)} \cdot \xi^{(i)}$
$\dot{a} = -(\partial_a \varepsilon_v^{(1)}) \cdot \xi_v^{(1)}$

From models to perception
A simple hierarchy
[Figure: hierarchy over levels (0) to (3), showing expectations $\mu_v^{(i)}$, $\mu_x^{(i)}$ and prediction errors $\xi_v^{(i)}$, $\xi_x^{(i)}$ at each level; labels: inward error stream, outward prediction stream]
Generative model:
$Dx^{(i)} = f^{(i)}(x^{(i)}, v^{(i)}) + \omega_x^{(i)}$
$v^{(i-1)} = g^{(i)}(x^{(i)}, v^{(i)}) + \omega_v^{(i)}$
Model inversion (inference)
Expectations:
$\dot{\mu}_v^{(i)} = D\mu_v^{(i)} - \partial_v \varepsilon^{(i)} \cdot \xi^{(i)} - \xi_v^{(i+1)}$
$\dot{\mu}_x^{(i)} = D\mu_x^{(i)} - \partial_x \varepsilon^{(i)} \cdot \xi^{(i)}$
Predictions:
$g^{(i)} = g^{(i)}(\mu_x^{(i)}, \mu_v^{(i)})$
$f^{(i)} = f^{(i)}(\mu_x^{(i)}, \mu_v^{(i)})$
Prediction errors:
$\xi_v^{(i)} = \Pi_v^{(i)} \varepsilon_v^{(i)} = \Pi_v^{(i)} (\mu_v^{(i-1)} - g^{(i)})$
$\xi_x^{(i)} = \Pi_x^{(i)} \varepsilon_x^{(i)} = \Pi_x^{(i)} (D\mu_x^{(i)} - f^{(i)})$

[Portrait: David Mumford]

Predictive coding with reflexes
[Figure: oculomotor example. Action: $\dot{a} = -\partial_a s \cdot \xi_v^{(1)}$, via oculomotor signals, the reflex arc, proprioceptive input and the pons. Perception: retinal input, geniculate, visual cortex, occipital cortex. Bottom-up or forward prediction error (superficial pyramidal cells); top-down or backward predictions, conditional predictions (deep pyramidal cells); attention.]

Biological agents resist the second law of thermodynamics
They must minimize their average surprise (entropy)
They minimize surprise by suppressing prediction error (free-energy)
Prediction error can be reduced by changing predictions (perception)
Prediction error can be reduced by changing sensations (action)
Perception entails recurrent message passing in the brain to optimize predictions
Action makes predictions come true (and minimizes surprise)

Beliefs about beliefs: beliefs about uncertainty
Perception as hypothesis testing – action as experiments
But how do we think action will change our beliefs?

Searching, salience and saccades
Where do I expect to look?
$\mu = \arg\min_{\mu} \left[ \Pi_s (s(a) - g(\mu))^2 + \Pi_\mu (\mu - \eta)^2 \right]$
$a = \arg\min_{a} \left[ \Pi_s (s(a) - g(\mu))^2 + \Pi_\mu (\mu - \eta)^2 \right]$
$\eta = \arg\min_{\eta}\ ?$

Sampling the world to minimise uncertainty
$H(S, \psi) = H(S \mid m) + H(\psi \mid S) = E_t[-\ln p(s(t) \mid m)] + E_t[H(\psi \mid S = s(t))]$
First term: free energy principle. Second term: minimise uncertainty.
$\eta(t) = \arg\min_{\eta} \{ H[q(\psi \mid \mu, \eta)] \}$
Salience: $S(\eta) = -H[q(\psi \mid \mu, \eta)]$
[Figure: stimulus, visual input $s(t)$, salience $S(\eta)$, sampling]

Perception as hypothesis testing – saccades as experiments
[Diagram as in "The basic ingredients", with prior expectations $\eta = \arg\min_{\eta} H[q(\psi \mid \mu, \eta)]$ added]
[Figure: frontal eye fields, parietal (where), visual cortex, pulvinar salience map $S(\eta)$, fusiform (what), superior colliculus, oculomotor reflex arc and action $a$; saccadic eye movements]

Saccadic fixation and salience maps
[Figure panels: Action (EOG); Hidden (oculomotor) states; Visual samples; Posterior belief; Conditional expectations about hidden (visual) states; And corresponding percept. x-axis: time (ms), 200 to 1400]

Beliefs about beliefs: beliefs about precision
If beliefs cause movement, how can I move when sensory evidence compels me to believe that I am not moving?
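The idea of sampling the world to minimise uncertainty can be sketched in a discrete toy setting. This is not the continuous scheme used in the saccade simulations; it is a hedged illustration that assumes a finite set of hypotheses about the scene, a finite set of candidate fixation points, and a binary feature observed at each fixation. Salience is taken to be the negative expected posterior entropy, in the spirit of $S(\eta) = -H[q(\psi \mid \mu, \eta)]$. The function names and the likelihood table are hypothetical.

```python
import numpy as np


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))


def salience(prior, lik):
    """Negative expected posterior entropy for each candidate fixation.

    prior[h]  : current belief over hypotheses h
    lik[h, l] : P(feature present | hypothesis h, fixation l)  (assumed table)
    """
    prior = np.asarray(prior, dtype=float)
    lik = np.asarray(lik, dtype=float)
    scores = np.zeros(lik.shape[1])
    for l in range(lik.shape[1]):
        expected_h = 0.0
        # Average over the two possible outcomes: feature present or absent.
        for p_o_given_h in (lik[:, l], 1.0 - lik[:, l]):
            joint = prior * p_o_given_h
            p_o = joint.sum()
            if p_o > 0:
                expected_h += p_o * entropy(joint / p_o)
        scores[l] = -expected_h
    return scores


def next_fixation(prior, lik):
    """Pick the fixation expected to leave the least posterior uncertainty."""
    return int(np.argmax(salience(prior, lik)))


# Two hypotheses; fixation 0 discriminates between them, fixation 1 does not.
prior = [0.5, 0.5]
lik = [[0.9, 0.5],
       [0.1, 0.5]]
print(salience(prior, lik), next_fixation(prior, lik))
```

In this example the uninformative fixation leaves the belief at its prior entropy, so the informative fixation has the higher salience and is selected. The design choice here is to score locations by expected uncertainty after sampling rather than by any property of the stimulus itself.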
Sensory attenuation, illusions and agency
Making your own sensations
[Equations: generative process and generative model relating the sensations $s_p$ and $s_s$ to the hidden states $x_i$ and $x_e$, the hidden causes $v_i$ and $v_e$ and action $a$, with Gaussian fluctuations $\omega_s$, $\omega_x$ and $\omega_v$]
[Figure: sensorimotor cortex, prefrontal cortex and thalamus, with descending predictions, descending modulation and ascending prediction errors; motor reflex arc]

High sensory attenuation
[Figure panels: prediction and error; hidden states; hidden causes; perturbation and action. x-axis: Time (bins), 5 to 30]

Low sensory attenuation
[Figure panels: prediction and error; hidden states; hidden causes; perturbation and action. x-axis: time, 5 to 30]

Sensory attenuation
Force matching illusion
[Figure panels for each condition: prediction and error; hidden states; hidden causes; perturbation and action. x-axis: Time (bins), 10 to 60]

Failures of sensory attenuation, with compensatory increases in non-sensory precision
[Figure: Simulated and Empirical (Shergill et al). x-axis: External (target) force; y-axis: Self-generated (matched) force]

A failure of sensory attenuation and delusions of control
[Figure panels: prediction and error; hidden states; hidden causes; perturbation and action. x-axis: Time (bins), 10 to 60]

Thank you
And thanks to collaborators: Rick Adams, Andre Bastos, Sven Bestmann, Jean Daunizeau, Mark Edwards, Harriet Brown, Lee Harrison, Stefan Kiebel, James Kilner, Jérémie Mattout, Rosalyn Moran, Will Penny, Klaas Stephan
And colleagues: Andy Clark, Peter Dayan, Jörn Diedrichsen, Paul Fletcher, Pascal Fries, Geoffrey Hinton, James Hopkins, Jakob Hohwy, Henry Kennedy, Paul Verschure, Florentin Wörgötter
And many others

Searching to test hypotheses – life as an efficient experiment
$H(S, \psi) = H(S \mid m) + H(\psi \mid S) = E_t[-\ln p(s(t) \mid m)] + E_t[H(\psi \mid S = s(t))]$
First term: free energy principle. Second term: minimise uncertainty.
$\eta(t) = \arg\min_{\eta} \{ H[q(\psi \mid \mu, \eta)] \}$

Time-scale: free-energy minimisation leading to…
[Time-scale axis: $10^{-3}$ s, $10^{0}$ s, $10^{3}$ s, $10^{6}$ s, $10^{15}$ s]
Perception and Action: The optimisation of neuronal and neuromuscular activity to suppress prediction errors (or free-energy) based on generative models of sensory data.
Learning and attention: The optimisation of synaptic gain and efficacy over seconds to hours, to encode the precisions of prediction errors and causal structure in the sensorium. This entails suppression of free-energy over time.
Neurodevelopment: Model optimisation through activity-dependent pruning and maintenance of neuronal connections that are specified epigenetically.
Evolution: Optimisation of the average free-energy (free-fitness) over time and individuals of a given class (e.g., conspecifics) by selective pressure on the epigenetic specification of their generative models.
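Since free-energy is the quantity being minimised at every time-scale listed above, it may help to check numerically the two ways of writing it that the talk relies on: free-energy as surprise plus a divergence, and as expected energy minus entropy. The sketch below assumes a discrete hidden state with three values and a single fixed observation; the joint probabilities and the approximate posterior `q` are made-up numbers for illustration only.

```python
import numpy as np


def free_energy(q, joint):
    """F = E_q[-ln p(psi, s)] - H[q] for discrete psi and a fixed observation s."""
    q = np.asarray(q, dtype=float)
    joint = np.asarray(joint, dtype=float)
    nz = q > 0
    energy = -np.sum(q[nz] * np.log(joint[nz]))
    entropy = -np.sum(q[nz] * np.log(q[nz]))
    return float(energy - entropy)


def surprise_and_kl(q, joint):
    """Return (-ln p(s), KL[q || p(psi | s)]) for the same setting."""
    q = np.asarray(q, dtype=float)
    joint = np.asarray(joint, dtype=float)
    evidence = joint.sum()          # p(s), marginalising over psi
    posterior = joint / evidence    # p(psi | s)
    nz = q > 0
    kl = np.sum(q[nz] * np.log(q[nz] / posterior[nz]))
    return float(-np.log(evidence)), float(kl)


joint = np.array([0.30, 0.10, 0.05])  # hypothetical p(psi, s) for the observed s
q = np.array([0.5, 0.3, 0.2])         # hypothetical approximate posterior
posterior = joint / joint.sum()

F = free_energy(q, joint)
surprise, kl = surprise_and_kl(q, joint)
print(F, surprise + kl, free_energy(posterior, joint))
```

The two expressions agree, free-energy is never below surprise because the divergence is non-negative, and the bound becomes tight when `q` equals the exact posterior.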