Mathematics for Machine Learning

Farid Saberi-Movahed
Department of Applied Mathematics, Graduate University of Advanced Technology
January 23, 2021

Some important categories of mathematics for machine learning

Contents
- Eigenvalues and Eigenvectors
- Similarity and Distance Learning
- Graph-Based Methods
- Matrix Decompositions
- Matrix Norms
- Some Potential and Hot Topics

Eigenvalues and Eigenvectors

Some applications
- Dimensionality Reduction
- Spectral Clustering
- Graph-Based Methods

Dimensionality Reduction

Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space, so that the low-dimensional representation retains some meaningful properties of the original data.

- Let us assume that the data matrix is defined as $X = [x^{(1)}; x^{(2)}; \ldots; x^{(m)}]$, where each $x^{(i)} \in \mathbb{R}^{1 \times n}$.
- The aim of dimensionality reduction is to find a low-dimensional representation $\tilde{X} = [\tilde{x}^{(1)}; \tilde{x}^{(2)}; \ldots; \tilde{x}^{(m)}]$.
- This process is achieved by a mapping $\varphi \colon \mathbb{R}^{n} \to \mathbb{R}^{k}$ with $\tilde{x}^{(i)} = \varphi(x^{(i)})$, where $k \ll n$.

Principal Component Analysis (PCA)

(A numerical sketch of PCA is given at the end of this section, after the spectral clustering reference.)

Spectral Clustering

Laplacian Matrix
- In graph theory, the Laplacian matrix (graph Laplacian) is a matrix representation of a graph.
- Given a simple graph $G$ with $m$ vertices, its Laplacian matrix $L \in \mathbb{R}^{m \times m}$ is defined as $L = D - A$, where
  - $A = [a_{ij}] \in \mathbb{R}^{m \times m}$ is the adjacency (similarity) matrix of the graph,
  - $D$ is the degree matrix, defined as $D = \mathrm{diag}(d_{11}, d_{22}, \ldots, d_{mm})$, where $d_{ii} = \sum_{j=1}^{m} a_{ij}$.

Spectral Clustering Algorithm

A Good Reference to Start with Spectral Clustering
- U. von Luxburg, A tutorial on spectral clustering, Statistics and Computing, 17(4):395-416, 2007.
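The PCA slide above is purely pictorial, so here is a concrete illustration: a minimal NumPy sketch of PCA via the eigendecomposition of the covariance matrix. The function name, the toy data, and the choice of eigendecomposition (rather than SVD) are my own and are not prescribed by the slides.

```python
import numpy as np

def pca(X, k):
    """Project the rows of X (m samples, n features) onto the top-k
    principal components via an eigendecomposition of the covariance."""
    X_centered = X - X.mean(axis=0)           # center each feature
    C = np.cov(X_centered, rowvar=False)      # n x n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)      # ascending eigenvalues (C is symmetric)
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]  # k largest eigenvectors
    return X_centered @ top                   # m x k low-dimensional representation

X = np.random.rand(100, 10)   # toy data: m = 100 points in R^10
X_tilde = pca(X, k=2)         # the mapping phi with k << n
print(X_tilde.shape)          # (100, 2)
```

Equivalently, one could take the top right singular vectors of the centered data; `eigh` is used here simply because this section is about eigenvalues and eigenvectors.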
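A minimal sketch of the spectral clustering pipeline built on the Laplacian $L = D - A$ defined above, following the standard recipe from von Luxburg's tutorial: form a similarity graph, compute the eigenvectors belonging to the $k$ smallest eigenvalues of $L$, and run k-means on that spectral embedding. The Gaussian similarity, the bandwidth $\sigma$, and the use of scikit-learn's KMeans are assumptions on my part, not details fixed by the slides.

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(X, n_clusters, sigma=1.0):
    """Unnormalized spectral clustering: similarity graph -> Laplacian
    -> bottom eigenvectors -> k-means on the spectral embedding."""
    # Gaussian (RBF) similarity matrix A (a common choice, assumed here)
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq_dists / (2 * sigma ** 2))
    np.fill_diagonal(A, 0)                    # simple graph: no self-loops
    D = np.diag(A.sum(axis=1))                # degree matrix
    L = D - A                                 # unnormalized graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)      # ascending eigenvalues
    U = eigvecs[:, :n_clusters]               # eigenvectors of the k smallest
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(U)

X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
labels = spectral_clustering(X, n_clusters=2)
```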
Similarity and Distance Learning

- Clustering is one of the most fundamental unsupervised learning techniques.
- The goal of clustering is to divide a set of data points into multiple clusters so that
  1. points within a cluster have high similarity,
  2. but are very dissimilar to points in other clusters.

Similarity

Distance Learning
- Euclidean distance: $d(x, y) = \|x - y\|_2$.
- Manhattan distance: $d(x, y) = \|x - y\|_1$.
- Chebyshev distance: $d(x, y) = \|x - y\|_\infty$.
- Mahalanobis distance: $d_M(x, y)^2 = (x - y)^T M (x - y)$, where $M$ is a symmetric positive semidefinite matrix.

(A short numerical sketch of these distances is given after the NMF section below.)

A Good Reference to Start with Distance Learning
- J. L. Suarez, S. Garcia, F. Herrera, A tutorial on distance metric learning: mathematical foundations, algorithms and software.

Graph-Based Methods

A General Framework of a Graph-Based Method
- Conducting a nearest-neighbor search.
- Computing the distances between neighboring points.
- Using the eigen-information to embed the high-dimensional points into a lower-dimensional space.

Some Important Examples of Graph-Based Methods
- Laplacian Eigenmaps
- Locally Linear Embedding (LLE)
- Isometric Mapping (ISOMAP)
- Hessian Eigenmaps
- Diffusion Maps

Two Good References to Start with Manifold Learning
- Review paper: A. J. Izenman, Introduction to manifold learning, Wiley Interdisciplinary Reviews: Computational Statistics, 4(5):439-446, 2012.
- Book (Section 6.7): S. Theodoridis, K. Koutroumbas, Pattern Recognition, Academic Press, 2003.

Matrix Decompositions
- Singular value decomposition (SVD): given $X \in \mathbb{R}^{n \times d}$, find $X \approx U \Sigma V^{T}$.
- Non-negative matrix factorization (NMF): given $X_+ \in \mathbb{R}^{n \times d}$, find non-negative matrix factors $U \in \mathbb{R}^{n \times k}$ and $H \in \mathbb{R}^{k \times d}$ such that $X_+ \approx U H$, s.t. $U \geq 0$ and $H \geq 0$.
- Semi-NMF: $X \approx U H_+$.
- Convex-NMF: $X \approx X W_+ H_+$.

Non-Negative Matrix Factorization (NMF)

$$X \approx U H, \quad \text{s.t. } U \geq 0 \text{ and } H \geq 0.$$

In fact, each column vector of $X$ can be represented as a linear combination of the column vectors of $U$, with the coefficients supplied by the corresponding column of $H$; that is, $x_i \approx U h_i$.

A Good Reference to Start with Non-Negative Matrix Factorization
- Review paper: A. C. Turkmen, A review of nonnegative matrix factorization methods for clustering, 2015.
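To make the factorization $X_+ \approx UH$ concrete, here is a minimal sketch using the classical Lee-Seung multiplicative updates for the Frobenius-norm objective. The slides do not prescribe an algorithm, so treat this as one standard choice, with the initialization and iteration count picked arbitrarily.

```python
import numpy as np

def nmf(X, k, n_iters=200, eps=1e-10):
    """Factor a non-negative X (n x d) as X ~ U @ H with U >= 0, H >= 0,
    using Lee-Seung multiplicative updates for the Frobenius objective."""
    n, d = X.shape
    rng = np.random.default_rng(0)
    U = rng.random((n, k))                    # non-negative initialization
    H = rng.random((k, d))
    for _ in range(n_iters):
        H *= (U.T @ X) / (U.T @ U @ H + eps)  # update coefficients
        U *= (X @ H.T) / (U @ H @ H.T + eps)  # update basis vectors
    return U, H

X = np.abs(np.random.randn(100, 40))          # toy non-negative data
U, H = nmf(X, k=5)
print(np.linalg.norm(X - U @ H))              # reconstruction error
```

Each update multiplies the current factor entrywise by a non-negative ratio, so $U$ and $H$ stay non-negative throughout, which is what makes multiplicative updates a natural fit for the constraints $U \geq 0$, $H \geq 0$.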
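Looking back at the Distance Learning slide, the four distances can also be checked numerically in a few lines. The choice of the inverse sample covariance for $M$ in the Mahalanobis distance is a common convention, not something fixed by the slide.

```python
import numpy as np

x, y = np.array([1.0, 2.0, 3.0]), np.array([2.0, 0.0, 4.0])

d_euclidean = np.linalg.norm(x - y, ord=2)        # ||x - y||_2
d_manhattan = np.linalg.norm(x - y, ord=1)        # ||x - y||_1
d_chebyshev = np.linalg.norm(x - y, ord=np.inf)   # ||x - y||_inf

# Mahalanobis: M must be symmetric positive semidefinite; a common choice
# is the inverse covariance of a data sample (an assumption on my part).
data = np.random.randn(200, 3)
M = np.linalg.inv(np.cov(data, rowvar=False))
d_mahalanobis = np.sqrt((x - y) @ M @ (x - y))
```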
Matrix Norms

For any arbitrary matrix $A \in \mathbb{R}^{n \times k}$, the $L_{l,p}$-norm is defined as follows:

$$\|A\|_{l,p} = \left( \sum_{i=1}^{n} \left( \sum_{j=1}^{k} |a_{ij}|^{l} \right)^{\frac{p}{l}} \right)^{\frac{1}{p}}.$$

- $\|\cdot\|_1$-norm: when $l = p = 1$, $\|A\|_1 = \sum_{i=1}^{n} \sum_{j=1}^{k} |a_{ij}|$.
- Frobenius norm: when $l = p = 2$, $\|A\|_F = \left( \sum_{i=1}^{n} \sum_{j=1}^{k} |a_{ij}|^2 \right)^{\frac{1}{2}}$.
- $L_{2,1}$-norm: when $l = 2$, $p = 1$, $\|A\|_{2,1} = \sum_{i=1}^{n} \left( \sum_{j=1}^{k} |a_{ij}|^2 \right)^{\frac{1}{2}}$.
- $L_{2,1/2}$-norm: when $l = 2$, $p = 1/2$, $\|A\|_{2,1/2} = \left( \sum_{i=1}^{n} \left( \sum_{j=1}^{k} |a_{ij}|^2 \right)^{\frac{1}{4}} \right)^{2}$.

Sparsity Learning

Some Potential and Hot Topics

Subspace Clustering

Subspace clustering is an extension of traditional clustering that seeks to find clusters in different subspaces within a dataset.

- A. Aldroubi, A. Sekmen, Reduced row echelon form and non-linear approximation for subspace segmentation and high-dimensional data clustering, Applied and Computational Harmonic Analysis, 37(2):271-287, 2014.

Self-Representation in Feature Selection

Let us assume that the feature representation of the data matrix is $X = [x_1, x_2, \ldots, x_n]$, where each $x_i \in \mathbb{R}^{m \times 1}$.

The main idea behind self-representation is that each feature vector $x_i$ can be linearly represented by the other features:

$$x_1 \approx z_{11} x_1 + \cdots + z_{i1} x_i + \cdots + z_{n1} x_n,$$
$$\vdots$$
$$x_i \approx z_{1i} x_1 + \cdots + z_{ii} x_i + \cdots + z_{ni} x_n,$$
$$\vdots$$
$$x_n \approx z_{1n} x_1 + \cdots + z_{in} x_i + \cdots + z_{nn} x_n.$$

Equivalently, in matrix form, $X \approx X Z$, where $Z = [z_{ij}] \in \mathbb{R}^{n \times n}$ collects the representation coefficients.

When Harmonic Analysis Meets Machine Learning

- Talk: R. Balan (Professor of Applied Mathematics, University of Maryland), When Harmonic Analysis Meets Machine Learning: Lipschitz Analysis of Deep Convolution Networks.

Graph Embedding

Graph embedding is an approach that is used to transform nodes, edges, and their features into a lower dimension while maximally preserving properties like graph structure and information.

A Very Good Reference to Start with Graph Neural Networks
- Review paper: Z. Wu, et al., A comprehensive survey on graph neural networks, IEEE Transactions on Neural Networks and Learning Systems, 2020.

Thank you for your attention!