The General Linear Model

Or, What the Hell’s Going on During Estimation?
What we hope to cover:
• Extension of linear to multiple regression
• Matrix formulation of multiple regression; residuals and
parameter estimates
• General and Generalised Linear Models
• Overparameterised models and the pseudoinverse solution
• Specific application to fMRI and basis sets
Multiple Regression
Last time, David talked about linear regression – that is determination of a linear
relationship between a single dependent and a single independent variable, of the form:
Y = βX + c
For example, we might think that the number of papers a researcher publishes a year (Y)
is related to how hard working he/she is (X) and we can attempt to determine the
regression coefficient (β) which reflects how much of an effect X has on Y.
This approach can be extended to account for multiple variables, such as how friendly
you were to potential reviewers at a recent conference, and combined in a linear fashion:
$$Y = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_L x_L + \varepsilon \qquad (1)$$
Each β parameter reflects the independent contribution of its explanatory variable to Y:
the expected change in Y per unit change in that variable with all the other variables held
constant, that is, what it explains once the other variables have been accounted for.
For example – one might see a negative correlation between height and hair length.
However, if we add an explanatory variable reflecting gender (a categorical or dummy
variable) then we see that the apparent correlation above actually reflects that, on
average, men are taller than women, whilst women tend to have longer hair, and that
height has no independent predictive value for hair length.
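The height and hair length example can be sketched with simulated data. This is a minimal illustration only: the population means, effect sizes and noise levels below are invented for the purpose, and the variable names are hypothetical. It fits hair length first on height alone and then on height plus a gender dummy variable.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical simulated population: gender drives both height and hair length.
is_female = rng.integers(0, 2, size=n).astype(float)   # dummy variable: 0 = male, 1 = female
height = 178.0 - 13.0 * is_female + rng.normal(0.0, 7.0, size=n)   # cm
hair = 8.0 + 25.0 * is_female + rng.normal(0.0, 6.0, size=n)       # cm, no direct height effect

# Simple regression: hair ~ height + constant
X_simple = np.column_stack([height, np.ones(n)])
beta_simple, *_ = np.linalg.lstsq(X_simple, hair, rcond=None)

# Multiple regression: hair ~ height + gender + constant
X_multi = np.column_stack([height, is_female, np.ones(n)])
beta_multi, *_ = np.linalg.lstsq(X_multi, hair, rcond=None)

print("height slope, simple regression:", beta_simple[0])
print("height slope, with gender dummy:", beta_multi[0])
```

Under these assumptions the height slope is clearly negative in the simple regression and close to zero once the gender variable is included, which is the pattern described above.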
The regression surface (the equivalent of the slope line in simple regression) expresses
the best prediction of the dependent variable, Y, given the explanatory variables (Xs).
However, observed data will deviate from this regression surface, the deviation from the
corresponding point being termed the residual.
Matrix Formulation of Multiple Regression
Writing out equation (1) for each observation of Y gives a series of simultaneous
equations:
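$$
\begin{aligned}
Y_1 &= x_{11}\beta_1 + \dots + x_{1l}\beta_l + \dots + x_{1L}\beta_L + \varepsilon_1 \\
&\;\;\vdots \\
Y_j &= x_{j1}\beta_1 + \dots + x_{jl}\beta_l + \dots + x_{jL}\beta_L + \varepsilon_j \\
&\;\;\vdots \\
Y_J &= x_{J1}\beta_1 + \dots + x_{Jl}\beta_l + \dots + x_{JL}\beta_L + \varepsilon_J
\end{aligned}
$$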
In Matrix Form:
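$$
\begin{pmatrix} Y_1 \\ \vdots \\ Y_j \\ \vdots \\ Y_J \end{pmatrix}
=
\begin{pmatrix}
x_{11} & \cdots & x_{1l} & \cdots & x_{1L} \\
\vdots &        & \vdots &        & \vdots \\
x_{j1} & \cdots & x_{jl} & \cdots & x_{jL} \\
\vdots &        & \vdots &        & \vdots \\
x_{J1} & \cdots & x_{Jl} & \cdots & x_{JL}
\end{pmatrix}
\begin{pmatrix} \beta_1 \\ \vdots \\ \beta_l \\ \vdots \\ \beta_L \end{pmatrix}
+
\begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_j \\ \vdots \\ \varepsilon_J \end{pmatrix}
$$
$$Y = X\beta + \varepsilon$$
where Y is the observed data, X the design matrix, β the parameters and ε the residuals.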
Parameter Estimation
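Typically the simultaneous equations shown before cannot be solved exactly (i.e. with ε = 0),
so we aim to achieve the best fit between model and data by minimising the sum of squares
of the residuals. This is the least squares estimate.
Residual sum of squares:
$$S = \sum_{j=1}^{J} e_j^2 = e^T e, \qquad e = Y - X\hat{\beta}$$
Minimised when, for each l:
$$\frac{\partial S}{\partial \hat{\beta}_l} = 2 \sum_{j=1}^{J} (-x_{jl})\,(Y_j - x_{j1}\hat{\beta}_1 - \dots - x_{jL}\hat{\beta}_L) = 0$$
which is the lth row of $X^T (Y - X\hat{\beta}) = 0$,
so the least squares estimates satisfy the normal equations
$$X^T Y = (X^T X)\,\hat{\beta}$$
giving, when $X^T X$ is invertible,
$$\hat{\beta} = (X^T X)^{-1} X^T Y \qquad (2)$$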
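The estimation step can be sketched numerically as follows. This is an illustration under assumed values: the design matrix, the "true" parameters and the noise level are all invented, and NumPy is used only for convenience. It solves the normal equations for the parameter estimates and checks that the residuals are orthogonal to every column of the design matrix, which is what the normal equations state.

```python
import numpy as np

rng = np.random.default_rng(1)
J, L = 50, 3                       # J observations, L explanatory variables (hypothetical sizes)

X = np.column_stack([rng.normal(size=(J, L - 1)), np.ones(J)])  # last column: constant term
beta_true = np.array([2.0, -1.0, 0.5])                          # hypothetical "true" parameters
Y = X @ beta_true + rng.normal(scale=0.3, size=J)

# Normal equations: (X^T X) beta_hat = X^T Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

residuals = Y - X @ beta_hat
rss = residuals @ residuals        # residual sum of squares

print("beta_hat:", beta_hat)
print("residual sum of squares:", rss)
print("X^T e (should be close to 0):", X.T @ residuals)
```

Solving the linear system with `np.linalg.solve` rather than forming the inverse explicitly is a numerical convenience; it gives the same estimates whenever the inverse exists.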
Extension to General and Generalised Linear Models
Multiple regression (as with many parametric tests, including t- and F-tests, ANOVAs,
ANCOVAs etc.) is basically a limited form of a generalised linear model (GLM), with
certain constraints, in particular:
• Only one dependent variable can be analysed
• It assumes that errors are independently, identically and normally distributed, with mean
0 and variance σ² (written ε ~ iid N(0, σ²))
The General Linear Model allows linear combinations of multiple dependent variables
(multivariate statistics), replacing the Y vector of J observations of a single Y variable
with a matrix of J observations of N different variables. Similarly, the β vector is replaced
with an L×N matrix, one column of L parameters per dependent variable. However, whilst an
fMRI experiment could be modelled with a Y matrix reflecting BOLD signal at N voxels over
J scans, SPM takes a mass univariate approach: each voxel is represented by a column vector
of observations over scans and processed through the same model.
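A mass univariate fit can be sketched as one matrix operation, since every voxel shares the same design matrix. The sizes, data and parameter values below are invented for illustration, and this is not a description of SPM's actual code. Each column of the estimated parameter matrix is simply that voxel's own least squares fit.

```python
import numpy as np

rng = np.random.default_rng(2)
J, L, N = 80, 2, 1000              # scans, regressors, voxels (all hypothetical sizes)

X = np.column_stack([rng.normal(size=J), np.ones(J)])   # same design for every voxel
B_true = rng.normal(size=(L, N))                        # one column of parameters per voxel
Y = X @ B_true + rng.normal(scale=0.5, size=(J, N))     # J x N data matrix

# Fit all voxels at once: each column of B_hat is one voxel's least squares estimate
B_hat = np.linalg.pinv(X) @ Y                           # L x N

# The same answer, fitting a single voxel on its own
b_voxel0, *_ = np.linalg.lstsq(X, Y[:, 0], rcond=None)

print(B_hat.shape)
print(np.allclose(B_hat[:, 0], b_voxel0))
```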
Generalised Linear Models (GLMs) do not assume spherical error distributions, and hence
can be utilised in order to correct for temporal correlations (this will be covered in a later
talk).
Overparameterised Models and Pseudoinversion
If the design matrix (X) has columns which are not linearly independent then it is
rank deficient and $X^T X$ has no inverse.
In this case there are an infinite number of parameter sets which describe the same
model, and correspondingly an infinite number of least squares estimates which satisfy
the normal equations. Such a model is said to be overparameterised (sometimes loosely
described as overdetermined).
Since we hope for a single set of parameters in order to construct our significance
tests a constraint must be applied to the estimates – the key point being then that
inference can only be meaningfully engaged in when considering functions of those
parameters not influenced by the chosen constraint.
SPM uses a pseudoinverse solution, and the pseudoinverse $(X^T X)^{-}$ can be substituted
for $(X^T X)^{-1}$ in eq. (2)
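The effect of the constraint can be sketched with a small rank-deficient design. The example assumes the Moore-Penrose pseudoinverse as computed by `np.linalg.pinv`; the two-group design, the data and the `rcond` cut-off are invented for illustration. Two different parameter vectors both satisfy the normal equations, yet the group difference, a function of the parameters that the constraint does not influence, is the same for both.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10                                           # observations per group (hypothetical)

g1 = np.r_[np.ones(n), np.zeros(n)]
g2 = np.r_[np.zeros(n), np.ones(n)]
X = np.column_stack([g1, g2, np.ones(2 * n)])    # g1 + g2 equals the constant column
Y = 3.0 * g1 + 5.0 * g2 + rng.normal(scale=0.5, size=2 * n)

print("rank of X:", np.linalg.matrix_rank(X), "with", X.shape[1], "columns")

XtX = X.T @ X
# Pseudoinverse in place of the inverse; rcond is set explicitly so the
# numerically zero singular value is discarded rather than inverted
beta_pinv = np.linalg.pinv(XtX, rcond=1e-10) @ X.T @ Y

# Adding any multiple of a null-space vector gives another least squares solution
null_vec = np.array([1.0, 1.0, -1.0])            # X @ null_vec == 0
beta_other = beta_pinv + 2.0 * null_vec

for name, b in [("pinv", beta_pinv), ("other", beta_other)]:
    print(name, "satisfies normal equations:", np.allclose(XtX @ b, X.T @ Y))

c = np.array([1.0, -1.0, 0.0])                   # contrast: group 1 minus group 2
print("contrast, pinv: ", c @ beta_pinv)
print("contrast, other:", c @ beta_other)
```

The individual parameter values differ between the two solutions, so only functions such as this contrast are meaningful to test.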
GLM and fMRI Models
We have looked so far at multiple regression and the general linear model in a fairly
abstract context. We shall now think about how it applies to fMRI experiments:
Y = Xβ + ε
• Y: Observed data. SPM uses a mass univariate approach, that is each voxel is treated as a
separate column vector of data.
• X: Design matrix, formed of several components which explain the observed data: timing
information consisting of onset vectors Omj and duration vectors Dm; the impulse response
function hm describing the shape of the expected BOLD response; other regressors, e.g.
movement parameters.
• β: Parameters defining the contribution of each component of the design matrix to the
model. These are estimated so as to minimise the error, and are used to generate the
contrasts between conditions (next week).
• ε: Error, the difference between the observed data and the model defined by Xβ. In fMRI
these are not assumed to be spherical (temporal correlations).
The design of the experiment is principally defined by:
• The stimulus function Sm, representing the occurrence of a stimulus in each of a series of
contiguous time bins, for each trial type m. This is generated by SPM2 from the onset
vector Omj and the duration vector Dm.
• The impulse response function hm for trial type m.
The observed data, Y, are then expressed as:
$$Y = \sum_m (h_m \ast S_m) + \varepsilon$$
where $\ast$ denotes convolution.
The impulse response functions are not known, but SPM assumes that they can be
modelled as linear combinations of basis functions bi such that:
$$h_m = \sum_i \beta_{mi}\, b_i$$
A typical basis function set might comprise the haemodynamic response function (HRF)
and its partial derivatives with respect to time and dispersion.
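How the stimulus function and a basis set combine into design matrix columns can be sketched as below. Everything specific here is an assumption made for illustration: the double-gamma shape and its parameters, the repetition time, the onsets and the finite-difference derivative are not SPM's exact implementation, and the dispersion derivative is omitted for brevity. The stimulus function for one trial type is convolved with each basis function, and each result becomes one column of X.

```python
import math
import numpy as np

TR = 2.0                                   # seconds per scan (hypothetical)
n_scans = 100

def gamma_pdf(t, shape):
    """Gamma density with unit scale, evaluated at times t >= 0."""
    return t ** (shape - 1) * np.exp(-t) / math.gamma(shape)

def hrf(t):
    """Illustrative double-gamma HRF: peak near 5 s, undershoot near 15 s."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0

t = np.arange(0.0, 32.0, TR)               # kernel sampled at the scan times
b1 = hrf(t)                                # basis function 1: HRF
b2 = np.gradient(b1, TR)                   # basis function 2: temporal derivative (finite difference)

# Stimulus function S_m: 1 in each time bin where the trial type is "on"
onsets = np.array([10, 30, 50, 70, 90])    # onset vector, in scans (hypothetical)
duration = 1                               # duration, in scans (hypothetical)
S = np.zeros(n_scans)
for onset in onsets:
    S[onset:onset + duration] = 1.0

# Convolve S_m with each basis function; each result is one column of X
columns = [np.convolve(S, b)[:n_scans] for b in (b1, b2)]
X = np.column_stack(columns + [np.ones(n_scans)])   # plus a constant term

print(X.shape)
```

Fitting Y = Xβ + ε with this X then estimates one β per basis function, so the fitted response for the trial type is the corresponding weighted sum of the basis functions.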
How does this look with data?
[Figure, three panels: observed data; model (green and red) and true signal (blue); error + noise, which the parameters are set to minimise.]
Summary
• The general linear model is a powerful statistical tool allowing determination of
multiple parameters predicting multiple dependent variables. Many other parametric
tests are special cases of the general linear model (t-tests, ANOVAs, F-test, regression)
• The design matrix contains the information about the designed aspects of the
experiment which may explain the observed data.
• Minimising the sum of squared differences between the modelled and observed data
allows determination of the optimal parameters for the model.
• The parameters can then be utilised to construct F- and t-tests to determine the
significance of contrasts between experimental factors (more next week).
• In fMRI we convolve the information about impulse response functions and the timing
of different trial types to give the design matrix. We must also utilise a Generalised
Linear Model to allow correction for temporal correlations over scans (more in a few
weeks).