Manifold Learning and Its Applications: Papers from the AAAI Fall Symposium (FS-10-06)

Invited Talk Abstracts

Yi Ma, Fei Sha, Lawrence Carin, Gilad Lerman, and Neil Lawrence

TILT and RASL: For Low-Rank Structures in Images and Videos

Yi Ma

In this talk, we will introduce two fundamental computational tools, TILT and RASL, for extracting rich low-rank structures in images and videos, respectively. Both tools use the same transformed Robust PCA model for the visual data, D ∘ τ = A + E, and practically the same algorithm for recovering the low-rank structure A from the visual data D, despite the image-domain transformation τ and the corruptions E. We will show how these two seemingly simple tools can unlock a wealth of information in images and videos that was previously difficult to extract. We believe these new tools will bring disruptive changes to many challenging tasks in computer vision and image processing, including feature extraction, image correspondence and alignment, 3D reconstruction, and object recognition. This is joint work with John Wright of MSRA, Emmanuel Candès of Stanford, my students Zhengdong Zhang, Xiao Liang, and Yigang Peng of Tsinghua, and Arvind Ganesh of UIUC.

Hierarchical Bayesian Embeddings for Analysis and Synthesis of Dynamic Data

Lawrence Carin

Hierarchical Bayesian methods are employed to learn a reversible statistical embedding. The proposed embedding procedure is connected to spectral embedding methods (e.g., diffusion maps and Isomap), yielding a new statistical spectral framework. The proposed approach allows one to discard the training data when embedding new data, allows synthesis of high-dimensional data from the embedding space, and provides accurate estimation of the latent-space dimensionality. Hierarchical Bayesian methods are also developed to learn a nonlinear dynamic model in the low-dimensional embedding space, allowing joint analysis of multiple types of dynamic data, sharing statistical strength and inferring inter-relationships. In addition to analyzing dynamic data, the learned model also yields effective synthesis. Example results are presented for statistical embedding, latent-space dimensionality estimation, and analysis and synthesis of high-dimensional (dynamic) motion-capture data.

Unsupervised Kernel Dimension Reduction

Fei Sha

We apply the framework of kernel dimension reduction, originally designed for supervised problems, to unsupervised dimensionality reduction. In this framework, kernel-based measures of independence are used to derive low-dimensional representations of covariates that maximally capture the information needed to predict responses. We extend this idea and develop similarly motivated measures for unsupervised problems, where covariates and responses are the same. Our empirical studies show that the resulting compact representations, obtained by optimizing the measure, yield meaningful and appealing visualizations and clusterings of data. Furthermore, when used in conjunction with supervised learners for classification, our methods lead to lower classification errors than state-of-the-art methods, especially when embedding data in spaces of very few dimensions.
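As a concrete illustration of the kernel-based independence measures underlying Fei Sha's abstract above, the following is a minimal sketch of the standard (biased) empirical HSIC estimator with Gaussian kernels. The function names, kernel choice, and bandwidth are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gaussian_gram(X, sigma=1.0):
    """Gram matrix of a Gaussian (RBF) kernel over the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    """Biased empirical HSIC: a kernel measure of dependence between X and Y."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    K = gaussian_gram(X, sigma)
    L = gaussian_gram(Y, sigma)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# Score how much dependence a 2-D projection X @ W retains with the data X.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
W, _ = np.linalg.qr(rng.standard_normal((10, 2)))  # random orthonormal basis
print(hsic(X @ W, X))
```

In the unsupervised setting the abstract describes, one would search for a projection W that maximizes such a dependence score between the compact representation and the original data.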
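Similarly, the transformed Robust PCA model in Yi Ma's abstract rests on decomposing a data matrix into a low-rank part plus a sparse error. Below is a minimal sketch of plain (untransformed) Robust PCA via the well-known inexact augmented Lagrangian scheme; it omits the domain transformation τ that TILT and RASL additionally optimize, and the parameter choices are illustrative assumptions rather than the authors' solver.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft(M, tau):
    """Entrywise soft thresholding: proximal operator of the l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def rpca(D, lam=None, mu=1e-2, rho=1.5, n_iter=100):
    """Split D into low-rank A plus sparse E (min ||A||_* + lam ||E||_1)."""
    m, n = D.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))  # standard PCP weight
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    Y = np.zeros_like(D)                   # Lagrange multiplier
    for _ in range(n_iter):
        A = svt(D - E + Y / mu, 1.0 / mu)
        E = soft(D - A + Y / mu, lam / mu)
        Y = Y + mu * (D - A - E)
        mu = min(mu * rho, 1e6)
    return A, E
```

Calling A, E = rpca(D) on a corrupted matrix D returns the low-rank approximation A and the sparse corruption E.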
Multi-Manifold Data Modeling: Foundations and Applications

Gilad Lerman

We present several methods for multi-manifold data modeling, i.e., modeling data by mixtures of possibly intersecting manifolds. We focus on algorithms for the special case where the underlying manifolds are affine or linear subspaces. We emphasize various theoretical results supporting the performance of some of these algorithms, in particular their robustness to noise and outliers. We demonstrate how such theoretical insights guide us in practical choices, and we present applications of these algorithms. This is part of joint work with E. Arias-Castro, S. Atev, G. Chen, A. Szlam, Y. Wang, T. Whitehouse, and T. Zhang.

A Probabilistic Perspective on Spectral Dimensionality Reduction

Neil Lawrence

Spectral approaches to dimensionality reduction typically reduce the dimensionality of a data set by taking the eigenvectors of a Laplacian or a similarity matrix. Classical multidimensional scaling also makes use of the eigenvectors of a similarity matrix. In this talk we introduce a maximum entropy approach to designing this similarity matrix. The approach is closely related to maximum variance unfolding, and other spectral approaches, such as locally linear embedding and Laplacian eigenmaps, also turn out to be closely related. Each method can be seen as a sparse Gaussian graphical model in which correlations between data points (rather than across data features) are specified in the graph. This also suggests optimization via sparse inverse-covariance techniques such as the graphical LASSO. Our hope is that this unifying perspective will allow the relationships between these methods to be better understood and will also provide the groundwork for further research.
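The eigenvector computation at the heart of Neil Lawrence's abstract can be made concrete with classical multidimensional scaling, which embeds points using the top eigenvectors of a double-centered similarity matrix. The sketch below is a textbook version of that step; the maximum entropy construction of the similarity matrix itself, which the talk proposes, is not shown, and the names are illustrative.

```python
import numpy as np

def classical_mds(D2, k=2):
    """Classical MDS: embed points given a matrix D2 of squared distances."""
    n = D2.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * H @ D2 @ H          # double-centered similarity matrix
    w, V = np.linalg.eigh(B)       # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]  # keep the top-k eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))
```

The same pattern, eigendecomposing a similarity or Laplacian matrix and keeping a few eigenvectors, underlies all of the spectral methods the abstract seeks to unify.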
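Finally, for the subspace case highlighted in Gilad Lerman's abstract, a generic baseline is K-subspaces, a K-means analogue that alternates between assigning points to their nearest subspace and refitting each subspace by SVD. The sketch below illustrates that generic heuristic under the assumption of linear (not affine) subspaces; it is not one of the specific algorithms from the talk, and all names are illustrative.

```python
import numpy as np

def k_subspaces(X, K=2, d=1, n_iter=20, seed=0):
    """Cluster the rows of X into K linear subspaces of dimension d."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(K, size=X.shape[0])
    for _ in range(n_iter):
        bases = []
        for k in range(K):
            Xk = X[labels == k]
            if len(Xk) < d:  # degenerate cluster: reseed with random points
                Xk = X[rng.choice(len(X), size=d + 1, replace=False)]
            # top-d right singular vectors span the best-fit subspace
            _, _, Vt = np.linalg.svd(Xk, full_matrices=False)
            bases.append(Vt[:d])
        # distance from each point to each subspace (norm of the residual)
        resid = np.stack(
            [np.linalg.norm(X - (X @ B.T) @ B, axis=1) for B in bases]
        )
        labels = np.argmin(resid, axis=0)
    return labels, bases
```

For example, labels, bases = k_subspaces(X, K=3, d=2) clusters the rows of X into three 2-dimensional linear subspaces.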