UNIVERSITETET I OSLO, INSTITUTT FOR INFORMATIKK

Independent Component Analysis and Exploratory Projection Pursuit

pp. 494-502 in the textbook “The Elements of Statistical Learning” by T. Hastie, R. Tibshirani, J. Friedman

Factor Analysis

Factor analysis model: x = As + ε

• x is a vector of p observed and typically correlated variables
• s is a vector of q < p uncorrelated latent variables (the common factors)
• A is a constant p × q matrix of factor loadings
• ε is a vector of uncorrelated zero-mean disturbances
• Typically s and ε are modeled as Gaussian random variables

Factor Analysis (cont’d)

Goal of factor analysis: to estimate A from the covariance matrix of the data,

    Σ = AA^T + D_ε

Problem with factor analysis: A can only be determined up to a rotation, since

    (AR)(AR)^T + D_ε = AA^T + D_ε    for any orthogonal matrix R

Independent Component Analysis (ICA)

ICA model:

    x_1 = a_11 s_1 + · · · + a_1p s_p
    x_2 = a_21 s_1 + · · · + a_2p s_p
    ...
    x_p = a_p1 s_1 + · · · + a_pp s_p

or, in matrix form, x = As, where

• x is a vector of observed variables
• s is a vector of independent and non-Gaussian latent variables (the independent components)
• A is a constant mixing matrix

Goal of ICA: to recover A

ICA (cont’d)

Further assumptions:

• x is zero mean and white, i.e. E(xx^T) = I (this can always be achieved by centering and PCA/SVD)
• A is orthogonal, so that s = A^T x

Definition: the differential entropy H of a random vector y with density g(y) is given by

    H(y) = − ∫ g(y) log g(y) dy

Fact: H(y) measures the information content of y

ICA (cont’d)

Definition: the mutual information I(y) between the components of a random vector y is given by

    I(y) = ∑_{j=1}^p H(y_j) − H(y)

Fact: I(y) is always non-negative and measures the degree of dependence between the components of y. If the components of y are independent, then I(y) = 0, so in particular I(s) = I(A^T x) = 0.

Idea: look for A such that I(A^T x) is minimized

ICA (cont’d)

Fact: among all random variables with equal variance, Gaussian variables have maximum entropy

Fact: if A is orthogonal, then

    I(A^T x) = ∑_{j=1}^p H((A^T x)_j) − H(x)

Since H(x) does not depend on A, minimizing I(A^T x) w.r.t. A is equivalent to minimizing the sum of the entropies of the components of A^T x, which in turn amounts to maximizing their departures from Gaussianity.

ICA (cont’d)

A possible measure of the departure from Gaussianity of a random variable y is the negentropy J(y), defined by

    J(y) = H(z) − H(y),

where z is a Gaussian random variable with the same variance as y. By the maximum-entropy property above, J(y) ≥ 0, with equality exactly when y is Gaussian. Estimating negentropy directly is difficult; in practice, approximations have to be used.

Exploratory Projection Pursuit

• A technique developed for visualizing high-dimensional data by finding “interesting projections”
• Interesting structures such as clusters or long tails are revealed by non-Gaussian projections
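The whitening assumption above (x zero mean with E(xx^T) = I, achieved by centering and PCA/SVD) is straightforward to carry out numerically. Below is a minimal numpy sketch; the function name `whiten` is ours for illustration, not from the textbook.

```python
import numpy as np

def whiten(X):
    """Center the columns of the (n_samples x p) matrix X, then rotate and
    scale with the eigendecomposition of the sample covariance so that the
    result has (approximately) identity covariance."""
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / X.shape[0]        # sample covariance matrix
    d, E = np.linalg.eigh(C)          # C = E diag(d) E^T
    return Xc @ E / np.sqrt(d)        # whitened data: E(zz^T) ≈ I

# Quick check on synthetic correlated data
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3)) @ np.array([[2.0, 0.5, 0.0],
                                           [0.0, 1.0, 0.3],
                                           [0.0, 0.0, 0.5]])
Z = whiten(X)
print(np.round(np.cov(Z, rowvar=False), 2))   # approximately the identity
```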
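The last ICA slide notes that negentropy must be approximated in practice. One standard route, not covered in these slides, is the FastICA fixed-point iteration of Hyvärinen, which maximizes a log cosh based negentropy approximation via the nonlinearity g(u) = tanh(u). The sketch below assumes already-whitened data (e.g. produced by the `whiten` sketch above); the function name, defaults, and toy mixing matrix are all illustrative.

```python
import numpy as np

def fastica(Z, n_components, n_iter=200, seed=0):
    """Symmetric FastICA on whitened data Z (n_samples x p).
    Returns W whose rows w_k give component estimates s_k = Z @ w_k."""
    rng = np.random.default_rng(seed)
    n, p = Z.shape
    W = rng.normal(size=(n_components, p))
    for _ in range(n_iter):
        U = Z @ W.T                        # current projections, (n, k)
        G = np.tanh(U)                     # g(u) = tanh(u)
        Gp = 1.0 - G ** 2                  # g'(u)
        # Fixed-point update: w <- E[z g(w^T z)] - E[g'(w^T z)] w
        W = (Z.T @ G).T / n - Gp.mean(axis=0)[:, None] * W
        # Symmetric decorrelation: W <- (W W^T)^(-1/2) W
        d, E = np.linalg.eigh(W @ W.T)
        W = E @ np.diag(1.0 / np.sqrt(d)) @ E.T @ W
    return W

# Toy demo: mix two non-Gaussian (uniform) sources, then unmix them.
rng = np.random.default_rng(1)
S = rng.uniform(-1.0, 1.0, size=(10000, 2))       # independent sources
X = S @ np.array([[1.0, 0.6], [0.4, 1.0]]).T      # observed mixtures x = As
Z = whiten(X)                                     # sketch above
W = fastica(Z, n_components=2)
S_hat = Z @ W.T
```

The recovered components match the true sources only up to sign, scaling, and permutation, which is the inherent ambiguity of the ICA model.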
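In the spirit of the projection pursuit slide, a crude search for an “interesting” direction can score random unit vectors by how non-Gaussian the resulting projection looks; here we use absolute excess kurtosis as the (assumed) interestingness index. Because the data are whitened, every unit-norm projection already has zero mean and unit variance, so E[y^4] − 3 is its excess kurtosis. The function name and trial count are hypothetical.

```python
import numpy as np

def most_nongaussian_direction(Z, n_trials=2000, seed=2):
    """Random-search projection pursuit on whitened data Z (n_samples x p):
    return the unit direction whose projection departs most from
    Gaussianity by absolute excess kurtosis, together with its score."""
    rng = np.random.default_rng(seed)
    best_w, best_score = None, -np.inf
    for _ in range(n_trials):
        w = rng.normal(size=Z.shape[1])
        w /= np.linalg.norm(w)
        y = Z @ w
        score = abs((y ** 4).mean() - 3.0)   # 0 for a Gaussian projection
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score
```

In practice one would optimize the index with gradient or fixed-point methods rather than random search; the point of the sketch is only that “interesting” here means “far from Gaussian”.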