Models of effective connectivity & Dynamic Causal Modelling (DCM)

Klaas Enno Stephan
Laboratory for Social & Neural Systems Research, Institute for Empirical Research in Economics, University of Zurich
Functional Imaging Laboratory (FIL), Wellcome Trust Centre for Neuroimaging, University College London

Methods & Models for fMRI data analysis in neuroeconomics
09 December 2009

Overview
• Brain connectivity: types & definitions
  – anatomical connectivity
  – functional connectivity
  – effective connectivity
• Psycho-physiological interactions (PPI)
• Dynamic causal models (DCMs)
  – DCM for fMRI: neural and hemodynamic levels
  – Parameter estimation & inference
• Applications of DCM to fMRI data
  – Design of experiments and models
  – Some empirical examples and simulations

Connectivity: a central property of any system
[Figures: communication systems (internet); social networks (Canberra, Australia)]
Figs. by Stephen Eick and A. Klovdahl; see http://www.nd.edu/~networks/gallery.htm

Structural, functional & effective connectivity
• anatomical/structural connectivity = presence of axonal connections
• functional connectivity = statistical dependencies between regional time series
• effective connectivity = causal (directed) influences between neurons or neuronal populations
Sporns 2007, Scholarpedia

Anatomical connectivity
• neuronal communication via synaptic contacts
• visualisation by tracing techniques
• long-range axons: "association fibres"

Diffusion-weighted imaging
Parker & Alexander 2005, Phil. Trans. B

Diffusion-weighted imaging of the corticospinal tract
Parker, Stephan et al. 2002, NeuroImage

Why would complete knowledge of anatomical connectivity not be enough to understand how the brain works?
Connections are recruited in a context-dependent fashion
[Figure: three time-series panels, time axis 0 to 100]
Synaptic strengths are context-sensitive: they depend on spatio-temporal patterns of network activity.

Connections show plasticity
• synaptic plasticity = change in the structure and transmission properties of a chemical synapse
• critical for learning
• can occur both rapidly and slowly
• NMDA receptors play a critical role
• NMDA receptors are regulated by modulatory neurotransmitters such as dopamine, serotonin and acetylcholine
[Figure: NMDA receptor]
Gu 2002, Neuroscience

Short-term plasticity
• NMDAR-independent
  – e.g. synaptic depression
  – non-inactivating sodium channels
  Tsodyks & Markram 1997, PNAS
• NMDAR-dependent
  – phosphorylation of AMPARs
  – modulation of EPSPs at NMDARs by DA, ACh, 5HT (gating)
  Reynolds et al. 2001, Nature
[Figure: peak PSP (mV)]

Different approaches to analysing functional connectivity
• Seed voxel correlation analysis
• Eigen-decomposition (PCA, SVD)
• Independent component analysis (ICA)
• any other technique describing statistical dependencies amongst regional time series

Seed-voxel correlation analyses
• Very simple idea:
  – hypothesis-driven choice of a seed voxel → extract reference time series
  – voxel-wise correlation with time series from all other voxels in the brain
[Figure: seed voxel]

Drug-induced changes in functional connectivity
Finger-tapping task in first-episode schizophrenic patients: voxels that showed changes in functional connectivity (p<0.005) with the left ant. cerebellum after medication with olanzapine
Stephan et al. 2001, Psychol. Med.

Does functional connectivity not simply correspond to co-activation in SPMs?
No, it does not. See the fictitious example in the figure:
[Figure: regional response A1, task T, regional response A2]
Here, both areas A1 and A2 are correlated identically with task T, yet they have zero correlation with each other:
r(A1,T) = r(A2,T) = 0.71 but r(A1,A2) = 0!
Stephan 2004, J. Anat.

Pros & Cons of functional connectivity analyses
• Pros:
  – useful when we have no experimental control over the system of interest and no model of what caused the data (e.g. sleep, hallucinations)
• Cons:
  – interpretation of resulting patterns is difficult / arbitrary
  – no mechanistic insight into the neural system of interest
  – usually suboptimal for situations where we have a priori knowledge about, and experimental control over, the system of interest
→ models of effective connectivity necessary

For understanding brain function mechanistically, we need models of effective connectivity, i.e. models of causal interactions among neuronal populations.

Some models for computing effective connectivity from fMRI data
• Structural Equation Modelling (SEM): McIntosh et al. 1991, 1994; Büchel & Friston 1997; Bullmore et al. 2000
• regression models (e.g. psycho-physiological interactions, PPIs): Friston et al. 1997
• Volterra kernels: Friston & Büchel 2000
• Time series models (e.g. MAR, Granger causality): Harrison et al. 2003, Goebel et al. 2003
• Dynamic Causal Modelling (DCM): bilinear: Friston et al. 2003; nonlinear: Stephan et al.
2008

Psycho-physiological interaction (PPI)
2x2 factorial design:

                          Stimulus factor
                          Stim 1    Stim 2
  Task factor   Task A    TA/S1     TA/S2
                Task B    TB/S1     TB/S2

GLM of a 2x2 factorial design:
$$y = (T_A - T_B)\,\beta_1 + (S_1 - S_2)\,\beta_2 + (T_A - T_B)(S_1 - S_2)\,\beta_3 + e$$
  – $(T_A - T_B)\,\beta_1$: main effect of task
  – $(S_1 - S_2)\,\beta_2$: main effect of stim. type
  – $(T_A - T_B)(S_1 - S_2)\,\beta_3$: interaction

We can replace one main effect in the GLM by the time series of an area that shows this main effect. E.g. let's replace the main effect of stimulus type by the time series of area V1:
$$y = (T_A - T_B)\,\beta_1 + \mathrm{V1}\,\beta_2 + (T_A - T_B)\,\mathrm{V1}\,\beta_3 + e$$
  – $(T_A - T_B)\,\beta_1$: main effect of task
  – $\mathrm{V1}\,\beta_2$: V1 time series ≈ main effect of stim. type
  – $(T_A - T_B)\,\mathrm{V1}\,\beta_3$: psycho-physiological interaction
Friston et al. 1997, NeuroImage

Attentional modulation of V1→V5
[Figure: attention modulating the V1→V5 connection; SPM{Z}; V5 activity over time and the V1 x Att. regressor; V5 activity plotted against V1 activity for attention vs. no attention]
Friston et al. 1997, NeuroImage; Büchel & Friston 1997, Cereb. Cortex

PPI: interpretation
$$y = (T_A - T_B)\,\beta_1 + \mathrm{V1}\,\beta_2 + (T_A - T_B)\,\mathrm{V1}\,\beta_3 + e$$
Two possible interpretations of the PPI term:
• modulation of V1→V5 by attention
• modulation of the impact of attention on V5 by V1

Two PPI variants
• "Classical" PPI:
  – Friston et al. 1997, NeuroImage
  – depends on factorial design
  – in the GLM, physiological time series replaces one experimental factor
  – physio-physiological interactions: two experimental factors are replaced by physiological time series
• Alternative PPI:
  – Macaluso et al. 2000, Science
  – interaction term is added to an existing GLM
  – can be used with any design

Task-driven lateralisation
Does the word contain the letter A or not?
letter decisions > spatial decisions

Is the red letter left or right from the midline of the word?
spatial decisions > letter decisions
• group analysis (random effects)
• n=16, p<0.05 corrected
• analysis with SPM2
Stephan et al. 2003, Science

Bilateral ACC activation in both tasks – but asymmetric connectivity!
• group analysis, random effects (n=15), p<0.05, corrected (SVC)
• letter vs spatial decisions: left ACC (-6, 16, 42) → left inf. frontal gyrus (IFG): increase during letter decisions
• spatial vs letter decisions: right ACC (8, 16, 48) → right IPS: increase during spatial decisions
Stephan et al. 2003, Science

PPI single-subject example
• Left ACC signal plotted against left IFG signal: letter decisions bL = 0.63, spatial decisions bVS = -0.16
• Right ACC signal plotted against right ant. IPS signal: spatial decisions bVS = 0.50, letter decisions bL = -0.19
Stephan et al. 2003, Science

PPI for event-related fMRI requires deconvolution
$$(A \otimes \mathrm{HRF}) \times (B \otimes \mathrm{HRF}) \neq (A \times B) \otimes \mathrm{HRF}$$
Gitelman et al. 2003, NeuroImage

Pros & Cons of PPIs
• Pros:
  – given a single source region, we can test for its context-dependent connectivity across the entire brain
  – easy to implement
• Cons:
  – very simplistic model: only allows modelling of contributions from a single area
  – ignores time-series properties of data
  – application to event-related data relies on a deconvolution procedure (Gitelman et al.
2003, NeuroImage)
  – operates at the level of BOLD time series
Sometimes very useful, but limited causal interpretability; in most cases, we need more powerful models.

Dynamic causal modelling (DCM)
• DCM framework was introduced in 2003 for fMRI by Karl Friston, Lee Harrison and Will Penny (NeuroImage 19:1273-1302)
• part of the SPM software package
• currently more than 100 published papers on DCM

Dynamic Causal Modeling (DCM)
• Neural state equation, driven by inputs $u$:
$$\frac{dx}{dt} = F(x, u, \theta)$$
• Hemodynamic forward model: neural activity → BOLD
  – fMRI: simple neuronal model, complicated forward model
• Electromagnetic forward model: neural activity → EEG / MEG / LFP
  – EEG/MEG: complicated neuronal model, simple forward model
Stephan & Friston 2007, Handbook of Brain Connectivity

Example: a linear system of dynamics in visual cortex
• regions: $x_1$ = LG left, $x_2$ = LG right, $x_3$ = FG left, $x_4$ = FG right (LG = lingual gyrus, FG = fusiform gyrus)
• inputs: visual input in the left (LVF, $u_1$) and right (RVF, $u_2$) visual field
$$\dot{x}_1 = a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + c_{12}u_2$$
$$\dot{x}_2 = a_{21}x_1 + a_{22}x_2 + a_{24}x_4 + c_{21}u_1$$
$$\dot{x}_3 = a_{31}x_1 + a_{33}x_3 + a_{34}x_4$$
$$\dot{x}_4 = a_{42}x_2 + a_{43}x_3 + a_{44}x_4$$

Example: a linear system of dynamics in visual cortex
• parameters: $\theta = \{A, C\}$
• regions: $x_1$ = LG left, $x_2$ = LG right, $x_3$ = FG left, $x_4$ = FG right (LG = lingual gyrus, FG = fusiform gyrus)
• Visual input in the left (LVF) and right (RVF) visual field.
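[Figure, continued: inputs $u_1$ = LVF, $u_2$ = RVF]
$$\dot{x} = Ax + Cu$$
  – $\dot{x}$: state changes
  – $A$: effective connectivity
  – $x$: system state
  – $C$: input parameters
  – $u$: external inputs
$$\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\\ \dot{x}_3\\ \dot{x}_4\end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} & 0\\ a_{21} & a_{22} & 0 & a_{24}\\ a_{31} & 0 & a_{33} & a_{34}\\ 0 & a_{42} & a_{43} & a_{44}\end{bmatrix} \begin{bmatrix}x_1\\ x_2\\ x_3\\ x_4\end{bmatrix} + \begin{bmatrix} 0 & c_{12}\\ c_{21} & 0\\ 0 & 0\\ 0 & 0\end{bmatrix} \begin{bmatrix}u_1\\ u_2\end{bmatrix}$$

Extension: bilinear dynamic system
• regions as before; inputs: $u_1$ = LVF, $u_2$ = RVF, $u_3$ = CONTEXT
$$\dot{x} = \Big(A + \sum_{j=1}^{m} u_j B^{(j)}\Big)x + Cu$$
$$\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\\ \dot{x}_3\\ \dot{x}_4\end{bmatrix} = \left( \begin{bmatrix} a_{11} & a_{12} & a_{13} & 0\\ a_{21} & a_{22} & 0 & a_{24}\\ a_{31} & 0 & a_{33} & a_{34}\\ 0 & a_{42} & a_{43} & a_{44}\end{bmatrix} + u_3 \begin{bmatrix} 0 & b_{12}^{(3)} & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & b_{34}^{(3)}\\ 0 & 0 & 0 & 0\end{bmatrix} \right) \begin{bmatrix}x_1\\ x_2\\ x_3\\ x_4\end{bmatrix} + \begin{bmatrix} 0 & c_{12} & 0\\ c_{21} & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0\end{bmatrix} \begin{bmatrix}u_1\\ u_2\\ u_3\end{bmatrix}$$

[Figure: a driving input $u_1(t)$ and a modulatory input $u_2(t)$ act on the neuronal states (activity $x_1(t)$, $x_2(t)$, $x_3(t)$); integration of the neural state equation, followed by the hemodynamic model $\lambda$, yields the BOLD signal $y$ of each region]
Neural state equation:
$$\dot{x} = \Big(A + \sum_j u_j B^{(j)}\Big)x + Cu$$
  – intrinsic connectivity: $A = \dfrac{\partial \dot{x}}{\partial x}$
  – modulation of connectivity: $B^{(j)} = \dfrac{\partial}{\partial u_j}\dfrac{\partial \dot{x}}{\partial x}$
  – direct inputs: $C = \dfrac{\partial \dot{x}}{\partial u}$
Stephan & Friston (2007), Handbook of Brain Connectivity

Bilinear DCM
[Figure: driving input and modulation]
Two-dimensional Taylor series (around $x_0 = 0$, $u_0 = 0$):
$$\frac{dx}{dt} = f(x, u) \approx f(x_0, 0) + \frac{\partial f}{\partial x}x + \frac{\partial f}{\partial u}u + \frac{\partial^2 f}{\partial x\,\partial u}ux + \dots$$
Bilinear state equation:
$$\frac{dx}{dt} = \Big(A + \sum_{i=1}^{m} u_i B^{(i)}\Big)x + Cu$$

Example: context-dependent decay
[Figure: two regions $x_1$, $x_2$; stimuli $u_1$ drive $x_1$; context $u_2$ modulates the self-connections]
$$\dot{x} = Ax + u_2 B^{(2)}x + Cu_1$$
$$\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\end{bmatrix} = \begin{bmatrix} \ast & a_{12}\\ a_{21} & \ast\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + u_2\begin{bmatrix} b_{11}^{(2)} & 0\\ 0 & b_{22}^{(2)}\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + \begin{bmatrix} c_1 & 0\\ 0 & 0\end{bmatrix}\begin{bmatrix}u_1\\ u_2\end{bmatrix}$$
($\ast$ marks the self-connection entries of $A$, whose symbols are not legible in the slide text.)
Penny, Stephan, Mechelli, Friston, NeuroImage (2004)

DCM parameters = rate constants
Integration of a first-order linear differential equation gives an exponential function:
$$\frac{dx}{dt} = ax \quad\Rightarrow\quad x(t) = x_0 \exp(at)$$
The coupling parameter $a$ is inversely proportional to the half-life $\tau$ of $x(t)$:
$$x(\tau) = 0.5\,x_0 = x_0 \exp(a\tau) \quad\Rightarrow\quad a = -\ln 2/\tau, \qquad \tau = -\ln 2/a$$
The coupling parameter $a$ thus describes the speed of the exponential change in $x(t)$.

The problem of hemodynamic convolution
Goebel et al. 2003, Magn. Res. Med.

Hemodynamic forward models are important for connectivity analyses of fMRI data
[Figure: Granger causality vs. DCM]
David et al. 2008, PLoS Biol.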
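The rate-constant reading of a coupling parameter can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names `half_life` and `simulate_decay` and the value a = -1 /s are illustrative assumptions. It computes the half-life from τ = -ln 2 / a and confirms by Euler integration of dx/dt = ax that the state has dropped to about half its initial value at t = τ.

```python
import math


def half_life(a):
    """Half-life tau of x(t) = x0 * exp(a * t) for a decaying state (a < 0, in 1/s)."""
    if a >= 0:
        raise ValueError("a must be negative for a decaying state")
    return -math.log(2) / a


def simulate_decay(a, x0=1.0, t_end=1.0, dt=1e-4):
    """Forward-Euler integration of dx/dt = a * x from t = 0 to t_end."""
    x = x0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        x += dt * a * x
    return x


# Hypothetical rate constant (assumption for illustration only).
a = -1.0
tau = half_life(a)                           # equals ln 2 for a = -1
x_tau = simulate_decay(a, x0=1.0, t_end=tau)  # close to 0.5 * x0
print(f"tau = {tau:.4f} s, x(tau) = {x_tau:.4f}")
```

A more negative a gives a shorter half-life, i.e. faster decay, which is the sense in which DCM coupling parameters act as rate constants.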
The hemodynamic model in DCM
• stimulus functions $u(t)$ enter the neural state equation:
$$\frac{dx}{dt} = \Big(A + \sum_{j=1}^{m} u_j B^{(j)}\Big)x + Cu$$
• hemodynamic state equations:
  – vasodilatory signal: $\dot{s} = x - \kappa s - \gamma (f - 1)$
  – flow induction (rCBF): $\dot{f} = s$
  – Balloon model, changes in volume: $\tau \dot{v} = f - v^{1/\alpha}$
  – Balloon model, changes in dHb: $\tau \dot{q} = f\,E(f, E_0)/E_0 - v^{1/\alpha}\, q/v$
• BOLD signal change equation:
$$\lambda(q, v) = \frac{\Delta S}{S_0} \approx V_0\left[k_1 (1 - q) + k_2 \left(1 - \frac{q}{v}\right) + k_3 (1 - v)\right]$$
$$k_1 = 4.3\,\vartheta_0 E_0\,TE, \qquad k_2 = \varepsilon\, r_0 E_0\,TE, \qquad k_3 = 1 - \varepsilon$$
• 6 hemodynamic parameters: $\theta^h = \{\kappa, \gamma, \tau, \alpha, \rho, \varepsilon\}$
  – important for model fitting, but of no interest for statistical inference
  – computed separately for each area (like the neural parameters) → region-specific HRFs!
Friston et al. 2000, NeuroImage; Stephan et al. 2007, NeuroImage

The hemodynamic model in DCM
[Figure: the same model, shown with simulated time courses (0 to 14 s) of the hemodynamic states for the RBM_N and CBM_N variants of the BOLD equation, each at parameter values 0.5, 1 and 2]
Stephan et al. 2007, NeuroImage

How interdependent are neural and hemodynamic parameter estimates?
[Figure: correlation matrix of parameter estimates (colour scale -1 to 1), with blocks for A, B, C, h and ε]
Stephan et al. 2007, NeuroImage

Bayesian statistics
$$p(\theta \mid y) \propto p(y \mid \theta)\,p(\theta)$$
posterior ∝ likelihood ∙ prior
  – $p(y \mid \theta)$: new data
  – $p(\theta)$: prior knowledge
Bayes' theorem allows one to formally incorporate prior knowledge into computing statistical probabilities.
In DCM: empirical, principled & shrinkage priors.
The "posterior" probability of the parameters given the data is an optimal combination of prior knowledge and new data, weighted by their relative precision.
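The precision weighting mentioned above can be made concrete for a single scalar parameter with a Gaussian prior and a Gaussian likelihood. This is a minimal sketch under those assumptions; the function name `gaussian_posterior` and the numbers are illustrative and are not taken from the slides or from SPM.

```python
def gaussian_posterior(prior_mean, prior_var, data_mean, data_var):
    """Combine a Gaussian prior and a Gaussian likelihood for one parameter.

    Precisions (inverse variances) add, and the posterior mean is the
    precision-weighted average of the prior mean and the data estimate.
    """
    prior_prec = 1.0 / prior_var
    data_prec = 1.0 / data_var
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * data_mean) / post_prec
    return post_mean, 1.0 / post_prec


# Hypothetical numbers: a shrinkage prior centred on zero (variance 1)
# and a data-based estimate of 0.4 with variance 0.25.
post_mean, post_var = gaussian_posterior(0.0, 1.0, 0.4, 0.25)
print(post_mean, post_var)
```

Because the data are four times as precise as the prior in this example, the posterior mean sits much closer to the data estimate than to the prior mean, and the posterior variance is smaller than either input variance.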
Overview: parameter estimation
• Combining the neural and hemodynamic states gives the complete forward model.
• An observation model includes measurement error e and confounds X (e.g. drift).
• Bayesian parameter estimation by means of a Levenberg-Marquardt gradient ascent, embedded into an EM algorithm.
• Result: Gaussian a posteriori parameter distributions, characterised by mean $\eta_{\theta|y}$ and covariance $C_{\theta|y}$.
[Figure: from stimulus function $u$ to modelled BOLD response]
  – neural state equation: $\dot{x} = \big(A + \sum_j u_j B^{j}\big)x + Cu$
  – activity-dependent vasodilatory signal: $\dot{s} = x - \kappa s - \gamma (f - 1)$
  – flow induction (rCBF): $\dot{f} = s$
  – changes in volume: $\tau \dot{v} = f - v^{1/\alpha}$
  – changes in dHb: $\tau \dot{q} = f\,E(f, \rho)/\rho - v^{1/\alpha}\, q/v$
  – hidden states: $z = \{x, s, f, v, q\}$
  – state equation: $\dot{z} = F(x, u, \theta)$
  – parameters: $\theta^h = \{\kappa, \gamma, \tau, \alpha, \rho\}$, $\theta^n = \{A, B^1 \dots B^m, C\}$, $\theta = \{\theta^h, \theta^n\}$
  – modelled BOLD response: $y = \lambda(x)$
  – observation model: $y = h(u, \theta) + X\beta + e$

Inference about DCM parameters: Bayesian single-subject analysis
• Gaussian assumptions about the posterior distributions of the parameters
• Use of the cumulative normal distribution to test the probability that a certain parameter (or contrast of parameters $c^T \eta_{\theta|y}$) is above a chosen threshold γ:
$$p = \phi_N\!\left(\frac{c^T \eta_{\theta|y} - \gamma}{\sqrt{c^T C_{\theta|y}\, c}}\right)$$
• By default, γ is chosen as zero ("does the effect exist?").

Bayesian single subject inference
[Figure: four-region model (LG left, LG right, FG left, FG right) with RVF stim. and LVF stim. as driving inputs and modulation by LD, LD|LVF and LD|RVF; estimates shown on the modulated connections: 0.13 ± 0.19, 0.34 ± 0.14, 0.44 ± 0.14, 0.29 ± 0.14, 0.01 ± 0.17, -0.08 ± 0.16]
Contrast: modulation of LG right → LG left by LD|LVF vs. modulation of LG left → LG right by LD|RVF
$p(c^T \eta > 0 \mid y) = 98.7\%$
Stephan et al. 2005, Ann. N.Y. Acad. Sci.

Inference about DCM parameters: Bayesian fixed-effects group analysis
Because the likelihood distributions from different subjects are independent, one can combine their posterior densities by using the posterior of one subject as the prior for the next:
$$p(\theta \mid y_1) \propto p(y_1 \mid \theta)\,p(\theta)$$
$$p(\theta \mid y_1, y_2) \propto p(y_2 \mid \theta)\,p(y_1 \mid \theta)\,p(\theta) \propto p(y_2 \mid \theta)\,p(\theta \mid y_1)$$
$$\dots$$
$$p(\theta \mid y_1, \dots, y_N) \propto p(y_N \mid \theta)\,p(\theta \mid y_1, \dots, y_{N-1})$$
Under Gaussian assumptions this is easy to compute from the individual posterior covariances and means:
  – group posterior covariance:
$$C^{-1}_{\theta|y_1,\dots,y_N} = \sum_{i=1}^{N} C^{-1}_{\theta|y_i}$$
  – group posterior mean:
$$\eta_{\theta|y_1,\dots,y_N} = C_{\theta|y_1,\dots,y_N} \left(\sum_{i=1}^{N} C^{-1}_{\theta|y_i}\, \eta_{\theta|y_i}\right)$$

Inference about DCM parameters: group analysis (classical)
• In analogy to "random effects" analyses in SPM, 2nd level analyses can be applied to DCM parameters:
  – separate fitting of identical models for each subject
  – selection of bilinear parameters of interest
  – one-sample t-test: parameter > 0?
  – paired t-test: parameter 1 > parameter 2?
  – rmANOVA: e.g. in case of multiple sessions per subject

What type of design is good for DCM?
Any design that is good for a GLM of fMRI data.

GLM vs. DCM
DCM tries to model the same phenomena as a GLM, just in a different way: it is a model, based on connectivity and its modulation, for explaining experimentally controlled variance in local responses.
No activation detected by a GLM → inclusion of this region in a DCM is useless!
Stephan 2004, J. Anat.

Multifactorial design: explaining interactions with DCM

                          Stimulus factor
                          Stim 1    Stim 2
  Task factor   Task A    TA/S1     TA/S2
                Task B    TB/S1     TB/S2

Let's assume that an SPM analysis shows a main effect of stimulus in X1 and a stimulus × task interaction in X2.
How do we model this using DCM?
[Figure: GLM and DCM accounts of this design (inputs Stimulus 1, Stimulus 2, Task A, Task B; regions X1 and X2), with simulated data]
Stephan et al. 2007, J. Biosci.
[Figure, continued: simulated responses of X1 and X2 in the four conditions (Stim 1/Task A, Stim 2/Task A, Stim 1/Task B, Stim 2/Task B), plus added noise (SNR=1)]

Example studies of DCM for fMRI
• DCM is now an established tool for fMRI & M/EEG analysis
• >100 studies published, incl. high-profile journals
• combinations of DCM with computational models

Task-driven lateralisation
Does the word contain the letter A or not?
letter decisions > spatial decisions
Is the red letter left or right from the midline of the word?
spatial decisions > letter decisions
• group analysis (random effects)
• n=16, p<0.05 corrected
• analysis with SPM2
Stephan et al. 2003, Science

Theories on inter-hemispheric integration during lateralised tasks
• Information transfer (for left-lateralised task)
  – Predictions: modulation by task conditional on visual field; asymmetric connection strengths
• Hemispheric recruitment
  – Predictions: modulation by task only; positive & symmetric connection strengths
• Inhibition/Competition
  – Predictions: modulation by task only; negative & symmetric connection strengths

[Figure: model space of 16 models (panels A to D) over LG left, LG right, FG left and FG right, with RVF stim. and LVF stim. as inputs; task (LD) modulation of intra- and inter-hemispheric connections, either B_ind (LD) or B_cond (LD|RVF, LD|LVF)]

Ventral stream & letter decisions
• Regions:
  – Left fusiform gyrus (FG): -44,-52,-18
  – Right fusiform gyrus (FG): 38,-52,-20
  – Left lingual gyrus (LG): -12,-64,-4
  – Right lingual gyrus (LG): 14,-68,-2
• Contrasts used to define the regions:
  – LD>SD, p<0.05 cluster-level corrected (p<0.001 voxel-level cut-off)
  – p<0.01 uncorrected
  – LD>SD masked incl. with RVF>LVF, p<0.05 cluster-level corrected (p<0.001 voxel-level cut-off)
• mean parameter estimates ± SE (n=12), shown on the connections modulated by LD, LD|LVF and LD|RVF: 0.12 ± 0.02, 0.25 ± 0.04, 0.36 ± 0.06, 0.16 ± 0.05, 0.02 ± 0.02, 0.03 ± 0.03
Stephan et al. 2007, J. Neurosci.
[Figure legend: significant modulation (p<0.05, Bonferroni-corrected); significant modulation (p<0.05, uncorrected); non-significant modulation]
LD>SD masked incl. with LVF>RVF, p<0.05 cluster-level corrected (p<0.001 voxel-level cut-off)

Ventral stream & letter decisions
• Regions:
  – Left MOG: -38,-90,-4
  – Left FG: -44,-52,-18
  – Right FG: 38,-52,-20
  – Left LG: -12,-70,-6
  – Right MOG: -38,-94,0
  – Left LG: -14,-68,-2
• Contrasts used to define the regions:
  – LD>SD, p<0.05 cluster-level corrected (p<0.001 voxel-level cut-off)
  – p<0.01 uncorrected
  – LD>SD masked incl. with RVF>LVF, p<0.05 cluster-level corrected (p<0.001 voxel-level cut-off)
  – LD>SD masked incl. with LVF>RVF, p<0.05 cluster-level corrected (p<0.001 voxel-level cut-off)
• mean parameter estimates ± SE (n=12), shown on the connections among MOG, LG and FG modulated by LD, LD|LVF and LD|RVF: 0.20 ± 0.04, 0.00 ± 0.01, 0.07 ± 0.02, 0.27 ± 0.06, 0.11 ± 0.03, 0.00 ± 0.04, 0.01 ± 0.03, 0.01 ± 0.01, 0.01 ± 0.01, 0.06 ± 0.02
[Figure legend: significant modulation (p<0.05, corrected); significant modulation (p<0.05, uncorrected); non-significant]
Stephan et al. 2007, J. Neurosci.

Asymmetric modulation of LG callosal connections is consistent across subjects
[Figure: MAP estimate (range -0.2 to 0.5) for each of subjects 1 to 12, left to right vs. right to left]
Stephan et al. 2007, J. Neurosci.

Incidental learning of audio-visual associations
[Figure: trial timeline, time axis 0 to 1400 ms, inter-trial interval 2000 ± 500 ms: fixation cross, auditory or visual "distractor" and target; auditory 80%, visual 80%]
Hypothesis: Incidental learning of this relation is reflected by prediction-error dependent changes in connectivity between auditory and visual areas.
den Ouden et al. 2009, Cereb.
Cortex

Incidental learning of audio-visual associations
2x2x2 factorial design (differential classical conditioning)
• CS+: Tone 1 predicts presence of VS
• CS-: Tone 2 predicts absence of VS
• 50% trials with auditory stimuli, 50% trials with visual stimuli

  Contingency                            Tone 1 (T1)               Tone 2 (T2)
                                         VS present   VS absent    VS present   VS absent
  p(VS|T1) = 0.8, p(VS|T2) = 0.2         40%          10%          10%          40%
  p(VS|T1) = 0.2, p(VS|T2) = 0.8         10%          40%          40%          10%

Rescorla-Wagner model of associative learning
During learning with predictive tones:
  – V1 & PUT increasingly activate the more surprising the visual outcome is
  – V1 & PUT increasingly deactivate the more expected the visual outcome is
[Figure: Rescorla-Wagner learning curve (parameter = 0.075), association (0 to 1) over trials 0 to 800; ME Visual: beta (% signal change) in V1 and PUT for A+V+, A+V-, A-V+, A-V-]
p < 0.05, corrected; random effects, n=16
den Ouden et al. 2009, Cereb. Cortex

A very simple neurocomputational model
[Figure: auditory stimuli (aud. input, A+) drive A1 (AUD CTX); visual stimuli (vis. input, V+) drive V1 (VIS CTX); the A1→V1 connection is modulated by the prediction error from the computational model; ME Auditory and ME Visual: beta (% signal change) for A+V+, A+V-, A-V+, A-V-; CS+ (V+ vs. V-); p = 0.028]
Learning changed the coupling strength of the A1→V1 connection by +2% (for unexpected VS) and -8% (for expected VS).
den Ouden et al. 2009, Cereb. Cortex

Thank you