M.Tech. (CS), Semester III, Course B50
Functional Brain Signal
Processing: EEG & fMRI
Lesson 2
Kaushik Majumdar
Indian Statistical Institute
Bangalore Center
kmajumdar@isibang.ac.in
EEG Processing

Preprocessing

Pattern recognition
EEG Artifacts
Benbadis and Rielo, 2008: http://emedicine.medscape.com/article/1140247-overview
Eye Blink Artifact:
Electrooculogram (EOG)
Matrix Representation of Multichannel EEG

M is an m × n matrix whose m rows represent the m EEG channels and whose n columns represent the n time points.

Often during EEG processing we need to find a matrix W such that WM is the processed signal.
Majumdar, under preparation, 2013
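
As a small illustration (my addition; the data and the choice of W are invented), the WM operation can be sketched in NumPy, here with W chosen as a common-average-reference transform:

```python
import numpy as np

m, n = 4, 1000                      # 4 channels, 1000 time points
rng = np.random.default_rng(0)
M = rng.standard_normal((m, n))     # stand-in for an m x n EEG recording

# W re-references each channel to the common average: one example of a
# spatial transform of the form "processed = W @ M".
W = np.eye(m) - np.ones((m, m)) / m
processed = W @ M                   # same shape: m channels x n time points

print(processed.shape)              # (4, 1000)
```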
EOG Identification by Principal
Component Analysis (PCA)
PCA Algorithm
PCA: Rotation and (Stretching or Contracting)
Wallstrom et al., Int. J. Psychophysiol., 53: 105-119, 2004
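
A minimal NumPy sketch of this geometric view (added for illustration; the two-channel data are synthetic): the eigenvectors of the data covariance supply the rotation, and the eigenvalues the stretching or contracting along each axis.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic 2-channel data with correlated components (stand-in for EEG+EOG).
X = rng.standard_normal((2, 5000))
X[1] += 0.8 * X[0]                  # introduce correlation between channels

X -= X.mean(axis=1, keepdims=True)  # center each channel
C = X @ X.T / X.shape[1]            # 2 x 2 covariance matrix

eigvals, eigvecs = np.linalg.eigh(C)  # eigvecs: rotation; eigvals: variances
components = eigvecs.T @ X            # rotate data onto the principal axes

# Principal components are uncorrelated: off-diagonal covariance ~ 0.
print(np.round(components @ components.T / X.shape[1], 3))
```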
Performance of PCA in EOG
Removal
Independent Component Analysis
(ICA)

In PCA the data components are assumed to be mutually orthogonal, which is too restrictive.

[Figure: PCA components versus the original data sets.]
ICA (cont.)

PCA will give poor results if the covariance
matrix has eigenvalues close to each other.
ICA as Blind Source Separation
(BSS)
Four musicians are playing in a room. From outside, only the music can be heard through four microphones. No one can be seen. How can the music heard from outside be decomposed into four sources?

[Figure: four sources S1, S2, S3, S4 picked up by four microphones 1, 2, 3, 4.]
Mathematical Formulation

x = As + n

A is the mixing matrix, x is the sensor vector, s is the source vector and n is noise, which is to be eliminated by filtering.
Mathematical Formulation (cont.)

Given x = As + n, find W such that s = Wx.

Any estimation technique for W is called an ICA technique, or a BSS technique in general.
Hyvarinen and Oja, Neural Networks, 13: 411-430, 2000
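
A toy version of the cocktail-party setup above (my addition, with invented sources and a made-up mixing matrix A), to make x = As + n concrete:

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 2000)

# Four "musicians": invented, statistically independent source signals.
s = np.vstack([
    np.sin(2 * np.pi * 5 * t),            # S1: sinusoid
    np.sign(np.sin(2 * np.pi * 3 * t)),   # S2: square wave
    rng.laplace(size=t.size),             # S3: super-Gaussian noise
    np.mod(7 * t, 1.0) - 0.5,             # S4: sawtooth
])

A = rng.uniform(0.5, 1.5, size=(4, 4))    # made-up mixing matrix
n = 0.01 * rng.standard_normal(s.shape)   # small sensor noise
x = A @ s + n                             # four microphone recordings

# ICA/BSS must recover s (up to order and scale) from x alone.
print(x.shape)                            # (4, 2000)
```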
ICA Algorithm: FastICA

Whitening:

Normalization (make the mean zero).
Make the variance one, i.e., E{xx^T} = I, where E is expectation, x is the vector of signals and I is the identity matrix.
FastICA (cont.)

Let E{xx^T} = BDB^T, where B is the orthogonal matrix of eigenvectors and D is the diagonal matrix of eigenvalues of E{xx^T}. Then z = BD^(-1/2)B^T x will satisfy E{zz^T} = I. Whitening complete.
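
A minimal NumPy sketch of this whitening step (my addition; the data are synthetic and the variable names mirror the slide):

```python
import numpy as np

def whiten(x):
    """Whiten centered signals x (channels x samples) so that E{zz^T} = I."""
    cov = x @ x.T / x.shape[1]            # estimate E{xx^T}
    d, B = np.linalg.eigh(cov)            # cov = B diag(d) B^T
    V = B @ np.diag(1.0 / np.sqrt(d)) @ B.T   # whitening matrix B D^{-1/2} B^T
    return V @ x, V

rng = np.random.default_rng(3)
x = rng.standard_normal((4, 2000))
x[1] += 0.5 * x[0]                        # correlate two channels
x -= x.mean(axis=1, keepdims=True)        # make the mean zero first

z, V = whiten(x)
print(np.round(z @ z.T / z.shape[1], 3))  # ~ identity matrix
```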
Non-Gaussianity

ICA is appropriate only when the probability distribution of the data set is non-Gaussian.

The Gaussian distribution is of the form p(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²)).
Entropy of Gaussian Variable

A Gaussian variable has the largest entropy
among a class of random variables with
equal variance (for a proof see Cover &
Thomas, Elements of Information Theory).
Here we will give an intuitive argument.
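
As a quick worked check of this claim (my addition, not on the slide): compare the differential entropies of a Gaussian and a uniform random variable with the same variance σ² = 1.

```latex
\[
H_{\text{gauss}} = \tfrac{1}{2}\log\!\left(2\pi e\sigma^{2}\right)
\approx 1.419 \ \text{nats} \quad (\sigma^{2} = 1),
\]
\[
H_{\text{unif}} = \log(b-a) = \log\sqrt{12}
\approx 1.242 \ \text{nats} \quad \text{where } \tfrac{(b-a)^{2}}{12} = 1 .
\]
```

The uniform variable of equal variance indeed has lower differential entropy than the Gaussian, consistent with the claim above.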
Entropy of a Random Variable X

En(X) = −∫ p(X) log p(X) dX
[Figure: two signal plots. The random signal X = random(t) carries more information; the deterministic signal X = sin(10t) carries less (zero) information.]
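
To put rough numbers on the figure (an illustrative sketch I have added; the bin count and signal lengths are arbitrary choices), one can estimate the amplitude-histogram entropy of each signal:

```python
import numpy as np

def hist_entropy(x, bins=64):
    """Crude Shannon entropy (bits) of a signal's amplitude histogram."""
    counts, _ = np.histogram(x, bins=bins, range=(-1, 1))
    p = counts / counts.sum()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

rng = np.random.default_rng(4)
t = np.linspace(0, 7, 5000)

deterministic = np.sin(10 * t)              # X = sin(10t)
random_signal = rng.uniform(-1, 1, t.size)  # X = random(t)

print(f"random:        {hist_entropy(random_signal):.2f} bits")
print(f"deterministic: {hist_entropy(deterministic):.2f} bits")
```

The random signal's entropy comes out higher. In the stricter sense of the slide, the sinusoid carries zero information once t is known, since X is then fully determined.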
Gaussian Random Variable Has Highest Entropy: Intuitive Proof

By the Central Limit Theorem (CLT), the mean of a class of random variables (the class being signified by uniform variance) follows a normal distribution as the number of members in the class tends to infinity (i.e., becomes very large).

Infinite observations hold an infinite, or maximum, amount of information.
Intuitive Proof (cont.)

Therefore a random variable with a normal distribution has the highest information content, so it has the highest entropy.

If each variable in a class of random variables admits only a finite number of nonzero values, the one with the uniform distribution will have the highest entropy.
Non-Gaussianity as Negentropy

J(y) = H(y_gauss) − H(y)

H is entropy and J is negentropy, where y_gauss is a Gaussian variable with the same variance as y. J is to be maximized. When J is maximal, y is reduced to a component. This can be shown by calculating the kurtosis for a component and for a sum of components including the said component (see Hyvarinen & Oja, 2000, p. 7).
Steps of FastICA after Whitening

1. Choose an initial (e.g., random) weight vector w.
2. Let w+ = E{x g(w^T x)} − E{g'(w^T x)} w.
3. Let w = w+ / ||w+||.
4. If not converged, go back to step 2 (Hyvarinen & Oja, 2000).

g is in the form of either of the two: g(u) = tanh(a1 u) with 1 ≤ a1 ≤ 2, or g(u) = u exp(−u²/2).
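
A compact sketch of this one-unit fixed-point iteration (my addition; it assumes whitened data z from the earlier step, uses g(u) = tanh(u), and replaces the expectations E{...} with sample averages):

```python
import numpy as np

def fastica_one_unit(z, max_iter=200, tol=1e-8, seed=0):
    """Estimate one independent component from whitened data z
    (channels x samples) via the FastICA fixed-point iteration."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(z.shape[0])
    w /= np.linalg.norm(w)                      # step 1: random unit vector

    for _ in range(max_iter):
        u = w @ z                               # projections w^T x
        g, g_prime = np.tanh(u), 1.0 - np.tanh(u) ** 2
        # step 2: w+ = E{x g(w^T x)} - E{g'(w^T x)} w
        w_new = (z * g).mean(axis=1) - g_prime.mean() * w
        w_new /= np.linalg.norm(w_new)          # step 3: renormalize
        if abs(abs(w_new @ w) - 1.0) < tol:     # step 4: converged (up to sign)
            return w_new
        w = w_new
    return w

# Usage: s_hat = fastica_one_unit(z) @ z   # one recovered component
```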
Exercise

ICA has been implemented in EEGLAB (e.g., in the runica function). Remove artifacts from sample EEG data using the ICA implementation in EEGLAB.
Concept of Independence in PCA and ICA

In PCA, independence means orthogonality, i.e., the pairwise dot product is zero.

In ICA, independence is statistical independence. Let x, y be random variables, p(x) the probability distribution function of x, and p(x,y) the joint probability distribution function of (x,y). If p(x,y) = p(x)·p(y) holds, we say x and y are statistically independent.
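
A tiny numeric illustration of this factorization test (my own example, not from the slides), using two independent fair coin flips:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.integers(0, 2, 100_000)   # fair coin 1
y = rng.integers(0, 2, 100_000)   # fair coin 2, independent of x

# Estimate p(x=1), p(y=1) and the joint p(x=1, y=1) from frequencies.
p_x = (x == 1).mean()
p_y = (y == 1).mean()
p_xy = ((x == 1) & (y == 1)).mean()

print(p_xy, p_x * p_y)            # both ~ 0.25: p(x,y) = p(x) p(y)
```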
Independence (cont.)

If vectors v1 and v2 are orthogonal, they are linearly independent. Suppose not; then a1 v1 + a2 v2 = 0 for some a1, a2 not both zero. Taking the dot product with v1 gives a1 (v1·v1) + a2 (v2·v1) = a1 (v1·v1) = 0, so a1 = 0; similarly a2 = 0, a contradiction.

If v1 = c v2, then both of them must have the same probability distribution, i.e., p(v1,v2) = p(v1) = p(v2). If v1 and v2 are linearly independent, p(v1,v2) = p(v1)·p(v2) may or may not hold.

If p(v1,v2) = p(v1)·p(v2) holds, then v1 and v2 are linearly independent.
Conditions for ICA Applicability

Sources are statistically independent.

Propagation delays in the mixing medium are negligible. Sources are time varying; mixing-medium delays may affect sources at different locations differently, thereby corrupting their temporal structures.

The number of sources equals the number of sensors.
References

Benbadis and Rielo, EEG artifacts, eMedicine, available online at http://emedicine.medscape.com/article/1140247-overview, 2008.

Hyvarinen and Oja, Independent component analysis: algorithms and applications, Neural Networks, vol. 13, pp. 411-430, 2000.

Majumdar, A Brief Survey of Quantitative EEG Analysis, Chapter 2, under preparation, 2013.
THANK YOU
This lecture is available at http://www.isibang.ac.in/~kaushik