Kenneth D. Harris 24/6/15
Exploratory vs. confirmatory analysis
• Exploratory analysis
  • Helps you formulate a hypothesis
  • End result is usually a nice-looking picture
  • Any method is equally valid, because it just helps you think of a hypothesis
• Confirmatory analysis
  • Where you test your hypothesis
  • Multiple ways to do it (Classical, Bayesian, Cross-validation)
  • You have to stick to the rules
• Inductive vs. deductive reasoning (K. Popper)
Principal component analysis
• Finds directions of maximum variance in a data set
• These correspond to the eigenvectors of the covariance matrix
Relation to SVD
• Remember that SVD expresses any matrix 𝐌 as
  𝐌 = 𝐔𝐒𝐕ᵀ
• The columns of 𝐔 are eigenvectors of 𝐌𝐌ᵀ, and the columns of 𝐕 are eigenvectors of 𝐌ᵀ𝐌:
  𝐌𝐌ᵀ 𝐮_i = s_i² 𝐮_i
  𝐌ᵀ𝐌 𝐯_i = s_i² 𝐯_i
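A minimal NumPy check of these relations; the matrix here is arbitrary example data, not anything from the lecture:

```python
import numpy as np

# Arbitrary example matrix, just to verify the relations numerically
M = np.random.randn(6, 4)

# Thin SVD: M = U @ np.diag(s) @ Vt
U, s, Vt = np.linalg.svd(M, full_matrices=False)

i = 0
# Columns of U are eigenvectors of M M^T with eigenvalues s_i^2
print(np.allclose(M @ M.T @ U[:, i], s[i] ** 2 * U[:, i]))    # True
# Columns of V (rows of Vt) are eigenvectors of M^T M
print(np.allclose(M.T @ M @ Vt[i, :], s[i] ** 2 * Vt[i, :]))  # True
```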
Relation to SVD
• 𝐗 is an 𝑁 × 𝑝 dimensional data matrix; each row is a 𝑝-dimensional vector.
  𝐶𝑜𝑣(𝐱) = (1/𝑁) Σ_i 𝐱_iᵀ 𝐱_i = (1/𝑁) 𝐌ᵀ𝐌
  where 𝑀_ij = 𝑋_ij − x̄_j, i.e. the mean row is subtracted.
PCA is equivalent to SVD of the mean-subtracted data matrix.
You never need to compute the covariance matrix!
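As a sketch of how this works in practice (the function name is mine; it assumes the data sit in a NumPy array X with one observation per row):

```python
import numpy as np

def pca_via_svd(X, n_components):
    """PCA from the SVD of the mean-subtracted data matrix.
    X is N x p, one observation per row; the covariance matrix is never formed."""
    M = X - X.mean(axis=0)                             # subtract the mean row
    U, s, Vt = np.linalg.svd(M, full_matrices=False)   # M = U S V^T
    components = Vt[:n_components]                     # principal directions (rows of V^T)
    scores = U[:, :n_components] * s[:n_components]    # projection of each observation
    explained_variance = s[:n_components] ** 2 / X.shape[0]
    return components, scores, explained_variance
```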
PCA: auditory cortex population vectors
Bartho et al, EJN 2009
Discriminant analysis
• Observations are grouped in 𝐶 classes
• Find a 𝐶 − 1 dimensional projection that maximizes distances between class means, scaled by variances
• Between-class covariance
  Σ_b = (1/𝐶) Σ_{c=1…C} (𝛍_c − 𝛍)(𝛍_c − 𝛍)ᵀ
• Within-class covariance
  Σ_w = (1/𝑁) Σ_{i=1…N} (𝐱_i − 𝛍_{c(i)})(𝐱_i − 𝛍_{c(i)})ᵀ, where c(i) is the class of observation i
• Find the top eigenvectors of Σ_w⁻¹ Σ_b: these give the projection (see the sketch below)
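A sketch of that computation; the function name and the small ridge added to Σ_w are my own choices, not part of the lecture:

```python
import numpy as np
from scipy.linalg import eigh

def discriminant_directions(X, labels):
    """Top eigenvectors of inv(Sigma_w) @ Sigma_b for an N x p data matrix X
    with one class label per row; returns the C-1 projection directions."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    p = X.shape[1]
    mu = X.mean(axis=0)
    Sb = np.zeros((p, p))                        # between-class covariance
    Sw = np.zeros((p, p))                        # within-class covariance
    for c in classes:
        Xc = X[labels == c]
        mc = Xc.mean(axis=0)
        Sb += np.outer(mc - mu, mc - mu)
        Sw += (Xc - mc).T @ (Xc - mc)
    Sb /= len(classes)
    Sw /= X.shape[0]
    Sw += 1e-8 * np.eye(p)                       # small ridge so Sw is invertible
    # Generalized eigenproblem Sb w = lambda Sw w, solved without explicitly inverting Sw
    vals, vecs = eigh(Sb, Sw)
    order = np.argsort(vals)[::-1]               # largest eigenvalues first
    return vecs[:, order[: len(classes) - 1]]    # C-1 discriminant directions
```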
Discriminant analysis: auditory cortex
• Projections chosen to maximally separate sustained responses
• Looks completely different to PCA!
Bartho et al, EJN 2009
Factor analysis
• Model: hidden factors plus independent noise:
  𝐱_i = 𝚲𝐟_i + 𝐔_i + 𝛍
• 𝐟_i are the hidden factors on trial i; the number of factors m < p is chosen by the user.
• 𝚲 is a p × m matrix of “factor loadings”.
• 𝐔_i is “noise” on trial i; all of its elements are independent.
• 𝛍 is the mean.
• Covariance matrix: 𝐶𝑜𝑣(𝐱) = 𝚲𝚲ᵀ + 𝚿, with 𝚿 diagonal.
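A sketch using scikit-learn's FactorAnalysis on placeholder data (in that API, components_ holds 𝚲ᵀ and noise_variance_ the diagonal of 𝚿):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

X = np.random.randn(500, 20)           # placeholder N x p data matrix
m = 3                                  # number of factors, m < p, chosen by the user

fa = FactorAnalysis(n_components=m).fit(X)
Lambda = fa.components_.T              # p x m factor loadings
Psi = np.diag(fa.noise_variance_)      # diagonal noise covariance
model_cov = Lambda @ Lambda.T + Psi    # implied Cov(x) = Lambda Lambda^T + Psi
factors = fa.transform(X)              # estimated hidden factors f_i, one row per trial
```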
Canonical correlation analysis
• Given two variables 𝐱 and 𝐲 , choose projections 𝐰 and 𝐯 to maximize correlation of 𝐱 ⋅ 𝐰 and 𝐲 ⋅ 𝐯 .
• For example these could be two neural populations.
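A sketch with scikit-learn's CCA; the two arrays stand in for simultaneously recorded populations and are placeholders, not real data:

```python
import numpy as np
from sklearn.cross_decomposition import CCA

X = np.random.randn(200, 30)           # placeholder: population 1, rows = trials/time bins
Y = np.random.randn(200, 25)           # placeholder: population 2, same rows

cca = CCA(n_components=2).fit(X, Y)
Xc, Yc = cca.transform(X, Y)           # the projections x.w and y.v
r = np.corrcoef(Xc[:, 0], Yc[:, 0])[0, 1]   # correlation of the first canonical pair
```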
Independent component analysis
PCA produces uncorrelated components. Uncorrelated ≠ independent!
ICA finds a decomposition 𝐱_i = 𝐀𝐬_i where the components of 𝐬_i are maximally independent of each other.
Essentially equivalent to being maximally non-Gaussian.
These components might correspond to biological/physical generators
ICA in practice
• FastICA algorithm
• Need to choose a measure of non-Gaussianity
• Do an SVD first! (see the sketch below)
  • It will take less time
  • It will give better results
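A sketch of this pipeline on a wide-field movie, assuming it is stored as a frames × pixels array; the array here is a placeholder, and keeping 12 SVD components simply mirrors the slides that follow:

```python
import numpy as np
from sklearn.decomposition import FastICA

movie = np.random.randn(1000, 4096)    # placeholder: n_frames x n_pixels
n_svd = 12                             # reduce to a few SVD components first

# SVD of the mean-subtracted movie, keeping only the top components
M = movie - movie.mean(axis=0)
U, s, Vt = np.linalg.svd(M, full_matrices=False)
reduced = U[:, :n_svd] * s[:n_svd]     # n_frames x n_svd temporal components

# FastICA on the reduced data; `fun` sets the measure of non-Gaussianity
ica = FastICA(n_components=4, fun="logcosh", random_state=0)
ic_timecourses = ica.fit_transform(reduced)   # independent temporal components
ic_maps = ica.mixing_.T @ Vt[:n_svd]          # their spatial maps, back in pixel space
```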
[Figures: wide-field movie, SVD components 1–3; ICs 1–4 computed from 12 SVD components; ICs 1–2 computed from 100 SVD components]
Non-negative matrix factorization
• Given an 𝑁 × 𝑝 matrix 𝐌, find 𝐖 and 𝐇 that minimize
  ‖𝐌 − 𝐖𝐇‖² = Σ_ij ( 𝑀_ij − Σ_k 𝑊_ik 𝐻_kj )²
where 𝐖 is 𝑁 × 𝑚, 𝐇 is 𝑚 × 𝑝, and all elements are non-negative.
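A sketch using scikit-learn's NMF on placeholder non-negative data (m = 7 just mirrors the number of factors shown on the following slide):

```python
import numpy as np
from sklearn.decomposition import NMF

M = np.abs(np.random.randn(300, 50))   # placeholder N x p non-negative data
m = 7                                  # number of non-negative factors

nmf = NMF(n_components=m, init="nndsvd", max_iter=500)
W = nmf.fit_transform(M)               # N x m, non-negative
H = nmf.components_                    # m x p, non-negative
err = np.linalg.norm(M - W @ H)        # the quantity being minimized, ||M - WH||
```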
[Figures: non-negative factors 1–7]
Many more methods…
• jPCA: Churchland, Cunningham et al., Nature 2012
• Mante, Sussillo et al., Nature 2013
Summary
• There are lots of methods for doing dimensionality reduction
• THEY ARE EXPLORATORY ANALYSES
• Different methods will often give you different results
• Use them – they might help you formulate a hypothesis
• Then do a confirmatory analysis. These usually do not use dimensionality reduction.