Modelling regional and psychophysiologic interactions in fMRI

advertisement
Technical Note:
Modeling regional and psychophysiologic interactions in fMRI:
The importance of hemodynamic deconvolution
Darren R. Gitelman1,2,3,4, William D. Penny5, John Ashburner5, Karl J. Friston5
The Northwestern Cognitive Brain Mapping Group, Cognitive Neurology and Alzheimer's
Disease Center1, Departments of Neurology2, and Radiology3, Northwestern University Medical
School, Chicago V.A. Healthcare System Lakeside Division4, Chicago, Illinois, 60611, and the
Wellcome Department of Imaging Neuroscience5, London, UK.
To Whom Correspondence Should be Addressed:
Darren R. Gitelman, M.D.
Northwestern University Medical School
320 E. Superior St., Searle 11-470
Chicago, IL 60611
Voice: 312-908-9023
Fax: 312-908-8789
Email: d-gitelman@northwestern.edu
printed: 2/5/2016
1
ABSTRACT
The analysis of functional magnetic resonance imaging (fMRI) time-series data can provide
information not only about task-related activity, but also about the connectivity (functional or
effective) among regions, and the influences of behavioral or physiologic states on that
connectivity (Büchel and Friston 1997). Similar analyses have been performed in other imaging
modalities, such as positron emission tomography (PET) (McIntosh, et al. 1994). However,
fMRI is unique because the information about the underlying neuronal activity is filtered or
convolved with a hemodynamic response function. Previous studies of regional connectivity in
fMRI have overlooked this convolution and have assumed that the observed hemodynamic
response approximates the neuronal response. In this paper, this assumption is revisited using
estimates of underlying neuronal activity. These estimates use a Parametric Empirical Bayes
formulation for hemodynamic deconvolution.
printed: 2/5/2016
2
INTRODUCTION
A common objective in functional imaging is to characterize the activity in a particular
brain region in terms of the interactions among inputs from other regions or by the interaction
between inputs from another region’s activity and behavioral state. Examples of analyses
modeling these interactions include psychophysiologic interactions (PPI), physiophysiologic
interactions and the incorporation of moderator variables in structural equation modeling (SEM)
(Friston, et al. 1997; Büchel, et al. 1999). Analyzing functional magnetic resonance imaging
(fMRI) data presents a unique challenge to these techniques because the experimenter is
presented with a time-series that represents the neural signal convolved with some hemodynamic
response function (HRF). However, interactions in the brain are expressed, not at the level of
hemodynamic responses, but at a neural level. Therefore, veridical models of neuronal
interactions require the neural signal or at least a well-constrained approximation to it. Given
blood oxygen level dependent (BOLD) signal in fMRI, the appropriate approximation can be
obtained by deconvolution using an assumed hemodynamic response. The need for robust
deconvolution is motivated neurobiologically (because brain interactions take place at a neural
level) and mathematically (because modeling interactions at a hemodynamic level is not
equivalent to modeling them neuronally).
Only a few previous manuscripts have commented on the role of deconvolution when
analyzing fMRI data. Glover used Wiener deconvolution, and noted that the deconvolved BOLD
signal more closely tracked the experimental design (Glover 1999). However, Wiener filtering
requires independent measurement of the noise spectral density, and it will be shown represents a
special and limited case of the approach we describe. Zarahn outlined a number of pitfalls in
deconvolution and applied a least-squares deconvolution method to particular time periods in a
trial (Zarahn 2000). This procedure also represents a special case of the technique we describe,
and required the assumption of specific noise estimates.
This note reviews the mathematical theory, presents a simple method for constrained
deconvolution, and demonstrates the method using simulated and empirical datasets.
THEORY
Brain interactions occur at a neuronal level, yet the signal observed in fMRI is the hemodynamic
response engendered by that neuronal activity. The problem is how to construct regressors,
printed: 2/5/2016
3
based on hemodynamic observations, which model neuronal influences. The hemodynamic
response to neural activity can be modeled by convolution with a finite impulse response
function
yt   h xt 

(1)
where yt is the measured BOLD signal at time t, h is the hemodynamic (impulse) response
function defined at times , and xt- is the neuronal signal at time t minus . This equation can be
reformulated in matrix notation as
y  Hx
(2)
where H is the HRF in Toeplitz matrix form.
Assume that we wish to model the interaction between neural activities in two areas.
Given the BOLD signal from two regions yA and yB we might try taking the product at each point
in time to generate the interaction term; however, this would not produce the signal induced by
interactions at a neuronal level (i.e., xA and xB representing neuronal activity from two regions),
since we have neglected convolution with the HRF. This can be expressed mathematically for
BOLD responses (3), and for the interaction of a psychological variable with BOLD signal (4).
y A y B  ( Hx A )( Hx B )  H ( x A xB )
(3)
HPy A  ( HP)( Hx A )  H ( PxA )
(4)
The proper form of the necessary physiologic variables (e.g., xA) must first be obtained
from the observed BOLD activity (i.e.., yA). The first step is to expand xA in terms of a temporal
basis set X such as a Fourier series or a set of cosine functions
x A  X
(5)
where  are parameters that control the expression of different frequency components of xA.
Introducing observation noise, we create a linear model whose maximum likelihood estimators
are obtained through the Gauss-Markov theorem
printed: 2/5/2016
4
y A  HX  
(6)
ˆ ML  ( X T H T  1 HX ) 1 X T H T  1 y A
(7)
where  ~ N (0,  2 ) , 2 is the error variance, and  is the error correlation matrix. Using the
estimates, ̂ ML , one could then derive the neural signal for each BOLD vector and compute the
required interaction terms using the left hand expressions in equations 3 or 4.
However, estimates based on a full-rank basis set are generally very inefficient (i.e.,
highly variable). This is particularly so for the coefficients in ̂ controlling high frequencies,
because the HRF selectively attenuates high frequencies. The estimates therefore have to be
constrained or regularized. This could be achieved, for example, by setting the high-frequency
estimates to zero thus removing high-frequency terms from the basis set. This would give a least
squares maximum likelihood deconvolution that is motivated by smoothness constraints on the
solution.
Constraints like this could also be implemented using a Bayesian formulation in which
the estimated neuronal activity conforms to a Bayesian estimator with priors on its frequency
structure. The notation and form of the equations are consistent with the derivations presented in
Friston et al (2002) to which the interested reader is referred for a full derivation (Friston, et al.
2002). The advantage of the Bayesian formulation is that it allows a more flexible specification
of the priors. The corresponding Bayesian estimator (under Gaussian assumptions) is
ˆ MAP  ( X T H T  1 HX   2C1 ) 1 X T H T  1 y A
(8)
where ˆ MAP is the maximum a posteriori estimate, and C 1 is the prior precision (inverse of the
prior covariance) of  . By setting the leading diagonal elements of C  , corresponding to the
highest frequencies, to infinity (i.e., these coefficients have zero mean and variance) the solution
is effectively restricted to lower frequencies. This would render (8) equivalent to the least
squares deconvolution described above, but with the highest frequencies removed from X. Note
that if the prior expectation of a parameter is zero and its prior precision is infinite, then it can
only assume a value of zero. Even if the corresponding columns of X enter into the estimation
model, the parameter estimates will still be zero. In other words, the results will be the same as
if these particular columns of X had been removed from the original model.
printed: 2/5/2016
5
.


C  
0

0
e.g.
 0 0
0


  
1
 C  
0
  0


 0 0
0
0
 
0

 0 
 0
 
 0
This Bayesian formulation is potentially useful because it accommodates more finessed
regularizations than simply removing columns from X . The simplest constraint (used in the
examples below) is to assume the underlying neuronal process is white, with uniform expression
of all frequencies:
e.g.
1 0  0
0 1
 
C   
 Q

 0


0  0 1 
(9)
This is a more graceful form of constraint in which all frequencies are estimated, but in a way
that ensures the frequency profile conforms roughly to the spectral profile specified in the prior
covariances. Note that a hyperparameter,  , has been introduced, which controls the prior
variance, given that the relative variances conform to the leading diagonal of Q. We could
assume a particular value for  based on expectations about the variance of xA. This would
render ˆ MAP a fully Bayesian estimator. However, it is difficult to specify  for all situations,
and a more flexible approach is to estimate it from the data. This leads to empirical Bayes
estimators in which  can be estimated using restricted maximum likelihood (ReML) as
described in Friston et al. (2002) using a simple hierarchical observation model and expectationmaximization (EM)
y A  HX   (1)
(10)
  0   ( 2)
where, as before, cov( (1) )   2  is the variance of the observation error and
cov( ( 2) )  Q  C  is the prior covariance of  which enters into (8) to give ˆ MAP (Friston, et
printed: 2/5/2016
6
al. 2002). Finally, ˆ MAP can then be used to calculate x̂ A , from equation 5, and form the
appropriate interaction terms for statistical modeling.
EXAMPLES
To illustrate the importance of deconvolving prior to forming interactions we will use the
interaction between two simulated event-related responses. When using the empirical Bayesian
approach, the basis set for deconvolution can be complete, or indeed over-complete. In this work
we have used a full rank discrete cosine set (as the explanatory coefficients), and a white noise
form for the prior constraints on C  . The examples described in this manuscript used the
standard HRF, consisting of two gamma functions, included in the SPM99 software. However, in
principle there is no reason why other HRFs could not be used (Zarahn, et al. 1999).
INSERT FIGURE 1
The examples also assumed an identical HRF across all the regions we modeled.
Although intersubject and interregional differences in the HRF have been previously described
(Aguirre, et al. 1998; Zarahn 2000), the use of a constrained HRF basis set spanning the space of
possible responses, is considered a reasonable solution to this problem (Friston, et al. 1998;
Zarahn 2000). We also note that differences in the form of the hemodynamic response function,
particularly in latencies, that occur either between subjects or regions could introduce biases in
the deconvolution estimators. This could prove particularly problematic if our experimental
paradigm induced high-frequency responses both at the neuronal and hemodynamic levels. For
this reason, it would seem sensible to employ experimental designs with relatively low frequency
experimental variance unless regional HRFs could be more accurately estimated.
The simulated data consisted of 115 scans, with a TR of 2 seconds. Figure 1A shows 20
event-related neural responses during a 230 second scanning session. Neural responses were
formed by convolving a stick function with an exponential decay having a time constant of 1000
msecs. Figure 1B shows responses to the same events from a different area. The only difference
between the two regions was the trial-to-trial latency in neural activity that varied uniformly
printed: 2/5/2016
7
between 0 and 8 seconds. Simulated neural activities from these two regions convolved with an
HRF are shown in figure 1C and 1D respectively.
The result of multiplying the hemodynamic responses, in these two regions, to generate
the interaction term is shown in figure 2A. In contrast, figure 2B shows the result of first
obtaining neural signals by deconvolution, multiplying the neural signals to obtain the interaction
term, and then reconvolving with a HRF. The results of these operations are clearly different
because the interaction term in figure 2A is completely insensitive to the relative onsets of the
neuronal activities that determine the degree of interaction.
INSERT FIGURE 2
The simulated BOLD data were noise free, whereas actual BOLD data contain a
substantial amount of noise, which could affect parameter estimation. To model observation
error, we added Gaussian noise having a standard deviation that was a fixed percentage of the
standard deviate of the simulated BOLD data. The original and deconvolved noise-free neuronal
signals are shown in figure 3A. The effects of adding 25%, 50% and 100% Gaussian noise on the
deconvolution is shown in figure 3B – D. The addition of Gaussian noise results in a relative loss
of high frequency components. This is because the Bayesian estimator depends more on the
priors as the level of noise increases. Although the noise estimator was white in the current
examples, high frequencies are selectively attenuated because they are less prevalent in the real
data (due to the effects of the hemodynamic response) and therefore shrink to their prior
expectation of zero.
INSERT FIGURE 3
To examine the effects of noise on the hemodynamic and neural interactions, neural event
vectors were constructed with 35 events. These vectors were then treated as outlined above,
including convolution with a HRF, addition of Gaussian noise, generation of the hemodynamic
interaction, deconvolution of the individual hemodynamic responses, generation of the neuronal
interaction, and reconvolution of the neuronal interaction with a HRF. Noise effects on the
BOLD versus neuronal interactions are shown in figure 4.
INSERT FIGURE 4
printed: 2/5/2016
8
The hemodynamic interactions (top panels) are much more affected by adding noise than
the neuronal interactions (lower panels). This example was chosen to make the distinction
between interactions at the neuronal and hemodynamic levels very apparent. In practice, the
distinction may be less pronounced because experimentally induced changes in neuronal activity
have fewer high frequency components, which attenuates the impact of the convolution. The
hemodynamic regressors in figures 2 and 4 are still valid regressors but the interactions they
model are at the hemodynamic, not the neuronal, level. In fact, the analysis of non-linearities in
hemodynamic responses presented in Friston et. al. (1998) used these sorts of regressors to
estimate second order Volterra kernels.
An application to empirical BOLD data is shown in figures 5 and 6. The event-related
data in figure 5 were taken from a lexical decision task (Gitelman, unpublished data), while the
block-design data in figure 6 are from a task of covert spatial attention (Gitelman, unpublished
data). As anticipated, the disparities between BOLD and neuronal interactions for event related
designs (Figure 5) are less prominent in block designs (Figure 6). Furthermore, real data may
show less pronounced differences between BOLD versus neuronal interactions than simulated
data because there are fewer high frequency components, particularly when block design data are
used. In order to demonstrate differences between hemodynamic and neuronal interactions,
correlations were examined between these interactions for each design type. The Pearson
correlation coefficient for the BOLD versus neuronal interaction for block design data (r =
0.948) was significantly better than that for event-related data (r = 0.709), based Fisher’s Ztransformation (z = 7.27, p < 1e-8).
INSERT FIGURES 5 and 6
DISCUSSION and CONCLUSION
This paper has illustrated the distinction between interactions among BOLD signals as
opposed to interactions occurring at a neuronal level. When modeling neural networks,
interactions occur at a neuronal level, thus, it is essential to generate the proper form of the
interaction term. The distinction between BOLD and neuronal interactions appears to be even
more prominent in the setting of high noise, further demonstrating the importance of
deconvolution for generally noisy fMRI data. Although there are potentially, simpler methods for
deconvolution, such as least squares or Wiener filtering, these methods provide rather harsh
printed: 2/5/2016
9
constraints and may show distortions with noisy data. We have motivated the use of an empirical
Bayesian formulation in order to generate well-constrained priors on the basis set, allowing the
specification of the prior in terms of the frequency structure of fMRI data.
An important issue is that the deconvolution procedure makes no assumptions about
whether the neural activity is task-related or due to noise. The deconvolution simply estimates
the best neuronal time course that, when convolved with the hemodynamic response function,
reproduces the observed data. There is no reference to task related stimulus functions or any
other specific cause of the underlying neuronal activity. This in contrast to previous approaches
(Glover 1999; Zarahn 2000) (but see software implementation note).
It is interesting to compare the empirical Bayesian deconvolution to related approaches
such as Wiener filtering. If we assume that the basis set X corresponds to a Fourier set then the
parameter estimates become the Fourier coefficients of the signal we wish to identify. Equation
(8) then can be rewritten, explicitly, in terms of the Fourier transforms of the data Y ( ) and
convolution kernel H ( ) in frequency space,
ˆ MAP 
H ( ) * Y ( )
 H ( )
2
 g  ( ) g  ( )

(11)
where g  ( ) is the prior spectral density, g  ( ) is the noise spectral density, and * denotes the
complex-conjugate
In the special case that g  ( )  1 , for all  , the empirical Bayesian
estimator implied by Equation (11) is formally identical to Wiener filtering as described in the
context of hemodynamic deconvolution, by Glover et al (1999). The implication of this is that
Wiener filtering is a special case of empirical Bayesian deconvolution that obtains when the prior
spectral density is unity. This prior corresponds to the simplest prior that could be used and is
equivalent to a minimum norm shrinkage prior on the Fourier coefficients of the underlying
signal. Thus, there are several advantages afforded by the empirical Bayesian deconvolution over
Wiener filtering. These include:
printed: 2/5/2016
10
1. Hyperparameters controlling the spectral density g  ( ) or, equivalently, serial covariances
 2  among the errors are estimated from the time-series being analyzed (these estimates are
restricted maximum likelihood estimates - ReML).
2. Hyperparameters controlling the prior variance components on the underlying signal are
estimated directly from the data (again using ReML estimators).
3. The estimation scheme is not constrained to stationary processes. For example, the Fourier
basis set can be replaced by suitable wavelets to finesse the time-frequency characterization
implicit in X, and the covariance matrices can have non-Toeplitz forms.
Software implementation note: Following the submission of this manuscript the
algorithms were instantiated in SPM2, which is due to be released in the next several months
(http://www.fil.ion.ucl.ac.uk/spm). In this instantiation, we have included three sets of
explanatory coefficients rather than just the cosine basis set alone (cf. the section on Examples,
pg 7). These coefficients include a cosine basis set, the full experimental design matrix, and the
filtering parameters used in the original analysis (e.g., high-pass, etc.), Figure 7. Each coefficient
set also has its own hyperparameter. The inclusion of all three estimators allows the empirical
Bayes procedure to adjust the final parameter estimates based on a more flexible approach to the
experimental variance. This is similar to the method described by Zarahn (2000), but does not
require manual specification of the individual basis set components (Zarahn 2000).
printed: 2/5/2016
11
ACKNOWLEDGMENTS
DRG was supported by a grant from the NIA (K23 00940-03). WBP, JA, and KJF were
supported by the Wellcome Trust. We would also like to thank two anonymous reviewers for
their helpful comments.
REFERENCES
Aguirre, G. K., Zarahn, E. and D'Esposito, M. (1998). "The variability of human BOLD
hemodynamic responses." NeuroImage 8: 360-369.
Büchel, C., Coull, J. T. and Friston, K. J. (1999). "The predictive value of changes in effective
connectivity for human learning." Science 283: 1538-1541.
Büchel, C. and Friston, K. J. (1997). "Modulation of connectivity in visual pathways by
attention: cortical interactions evaluated with structural equation modeling and fMRI."
Cerebral Cortex 7: 768-778.
Friston, K. J., Buechel, C., Fink, G. R., Morris, J., Rolls, E. and Dolan, R. J. (1997).
"Psychophysiological and modulatory interactions in neuroimaging." NeuroImage 6(3):
218-29.
Friston, K. J., Fletcher, P., Josephs, O., Holmes, A., Rugg, M. D. and Turner, R. (1998). "Eventrelated fMRI: characterizing differential responses." NeuroImage 7(1): 30-40.
Friston, K. J., Josephs, O., Rees, G. and Turner, R. (1998). "Nonlinear event-related responses in
fMRI." Magnetic Resonance in Medicine 39(1): 41-52.
Friston, K. J., Penny, W., Phillips, C., Kiebel, S., Hinton, G. and Ashburner, J. (2002). "Classical
and Bayesian inference in neuroimaging: Theory." NeuroImage 16(2): 465-483.
Glover, G. H. (1999). "Deconvolution of impulse response in event-related BOLD fMRI."
NeuroImage 9: 416-429.
McIntosh, A. R., Grady, C. L., Ungerleider, L. G., Haxby, J. V., Rapoport, S. I. and Horwitz, B.
(1994). "Network analysis of cortical visual pathways mapped with PET." Journal of
Neuroscience 14: 655-666.
Zarahn, E. (2000). "Testing for neural responses during temporal components of trials with
BOLD fMRI." NeuroImage 11: 783-796.
Zarahn, E., Aguirre, G. K. and D'Esposito, M. (1999). "Temporal isolation of the neural
correlates of spatial mnemonic processing with fMRI." Cognitive Brain Research 7: 255268.
printed: 2/5/2016
12
Figure Legends
Figure 1: Neural activity from twenty simulated neural events is displayed. The signal level in all
figures has been normalized and displayed in arbitrary units (au). The activity from area B (1B)
has been slightly and variably delayed in relation to the activity from area A (1A). Neural
activity was produced by randomly generating twenty delta functions and convolving them with
an exponential decay function. C and D illustrate simulated noiseless BOLD signal generated by
convolving the neural activity with a HRF.
Figure 2: A) Interaction term produced by multiplying BOLD signal A with BOLD signal B.
Note this is an interaction of hemodynamic activity, not neural activity. B) Interaction term
produced by deconvolving the BOLD signals A and B, multiplying the resulting estimated neural
activities and reconvolving with a HRF. Note the difference between the interactions.
Figure 3: The effect of noise on the deconvolution. A) Plot of original (dotted line) and
deconvolved (solid line) neural signal from noiseless simulated BOLD data. The effect of adding
Gaussian noise, 25% to 100% before deconvolution, is illustrated in B – D.
Figure 4: Hemodynamic interactions in the presence of 0% and 100% Gaussian noise are shown
in 4A and 4B. Deconvolved neuronal interactions in the setting of 0% and 100% Gaussian noise
is illustrated in 4C and 4D respectively
Figure 5: Psychophysiological interactions from an event-related fMRI data. A) event-related
BOLD signal (solid-black) and regenerated BOLD signal (dashed-red). The regenerated signal
was produced by deconvolving the BOLD signal using the techniques in this report, and then
reconvolving with a HRF in order to compare the result of the deconvolution procedure with the
original signal. B) Psychological variable; C) Hemodynamic interaction; D) Neuronal
interaction.
Figure 6: Psychophysiological interactions from a block-design experiment. A) block-design
BOLD signal (solid-black) and regenerated BOLD signal (dashed-red). The regenerated signal
was produced as in figure 5. B) psychological variable; C) Hemodynamic interaction; D)
Neuronal interaction.
printed: 2/5/2016
13
Figure 7: Example of explanatory coefficients used for the parametric empirical Bayes
deconvolution in SPM2. The experiment was organized as a block design alternating between a
rest and active condition 8 times, every 7 scans. The first 112 columns of the matrix is a cosine
basis set; the next 1 column is the design matrix; and the last 8 columns is a high-pass filter. In
the original implementation of the procedure (cf. section on Examples) only the cosine basis set
was used.
printed: 2/5/2016
14
Figure 1
printed: 2/5/2016
15
Figure 2
printed: 2/5/2016
16
Figure 3
printed: 2/5/2016
17
Figure 4
printed: 2/5/2016
18
Figure 5
printed: 2/5/2016
19
Figure 6
printed: 2/5/2016
20
Figure 7
printed: 2/5/2016
21
Download