Multivariate distance and similarity - Bob Murphy

Multivariate Distance and Similarity Robert F. Murphy Cytometry Development Workshop 2000 General Multivariate Dataset  We are given values of p variables for n independent observations  Construct an n x p matrix M consisting of vectors X1 through Xn each of length p Multivariate Sample Mean  Define mean vector I of length p n I( j)  n  M(i, j) or i1 n matrix notation I  Xi i1 n vector notation Multivariate Variance  Define variance vector s2 of length p 2 n s ( j)  2  M(i, j)  I( j) i1 n 1 matrix notation Multivariate Variance  or 2 n s  2  X i  I i1 n 1 vector notation Covariance Matrix  Define a p x p matrix cov (called the covariance matrix) analogous to s2 n cov( j,k)   M(i, j)  I( j )M(i,k)  I(k) i1 n 1 Covariance Matrix  Note that the covariance of a variable with itself is simply the variance of that variable cov( j, j)  s ( j) 2 Univariate Distance  The simple distance between the values of a single variable j for two observations i and l is M(i, j)  M(l, j) Univariate z-score Distance  To measure distance in units of standard deviation between the values of a single variable j for two observations i and l we define the z-score distance M(i, j)  M(l, j) s ( j) Bivariate Euclidean Distance  The most commonly used measure of distance between two observations i and l on two variables j and k is the Euclidean distance M(i, j)  M(l, j)  M(i,k)  M(l,k ) 2 2 Multivariate Euclidean Distance  This can be extended to more than two variables p  M(i, j)  M(l, j) j1 2 Effects of variance and covariance on Euclidean distance B A The ellipse shows the 50% contour of a hypothetical population. Points A and B have similar Euclidean distances from the mean, but point B is clearly “more different” from the population than point A. Mahalanobis Distance  To account for differences in variance between the variables, and to account for correlations between variables, we use the Mahalanobis distance D  X i  X l cov X i  Xl  2 -1 T

Multivariate distance and similarity - Bob Murphy

Related documents

Products

Support

Multivariate distance and similarity - Bob Murphy

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib