Interactive Learning using Manifold Geometry
Eric Eaton, Gary Holness, and Daniel McFarlane
Lockheed Martin Advanced Technology Laboratories, Artificial Intelligence Research Group
This work was supported by internal funding from Lockheed Martin and by the National Science Foundation under NSF ITR #0325329.

Introduction: Motivation
Information monitoring systems use a scoring function f to focus user attention
– f is customized to the current situation
– Often, no data are available to learn f
– Users require fine control over the scoring function
We propose an interactive learning method that enables the user to iteratively refine f
[Figures: maritime situational awareness and network security monitoring applications]

Introduction: Interactive Refinement
Uses a combination of manual input and machine learning:
1. The user manually selects and repositions a data point
2. The system relearns the model f and updates the scatterplot
Key idea: each adjustment should generalize naturally to the model
We use least squares with Laplacian regularization to learn f, based on the manifold underlying the data
[Figure: model view and user view scatterplots of relevancy vs. a 1D projection of the data]

Related Work: Interactive Learning
Crayons tool for interactive object classification (Fails & Olsen, 2003)
Interactive decision tree construction (Ware et al., 2001)
Interactive visual clustering (desJardins et al., 2008)
Feature selection (Dy & Brodley, 2000)
Hierarchical clustering (Wills, 1998)
[Figures, used with permission: Crayons by Fails & Olsen; Interactive Visual Clustering by desJardins et al., showing the initial view, the view after 2 adjustments, and the view after 14 adjustments]
Related Work: Interactive vs. Active Learning
Active learning selects instances for labeling by an oracle (Cohn et al., 1996; McCallum & Nigam, 1998; Tong, 2001)
– Interactive ML starts with unlabeled data and an incorrect model; the user determines the adjustments; the goal is to collaborate with the user to define or adjust a model
– Active learning starts with unlabeled data and no model; the system selects instances for labeling; the goal is to minimize the number of labels needed to learn a model

Mechanisms for User Interaction
Data set X = {x1, …, xn}, where each xi ∈ R^d
The user supplies the initial scoring function f0
– We used a linear function for f0
The current scoring function is given by f (initially f = f0)
The user adjusts the scores of individual data points to change f until it matches the true (hidden) function F
– Details of each instance are available in a side panel
– The user selects an instance and drags it up or down to change its score
Future work: similarity metric updates, qualitative feedback
[Figure: user view scatterplot of relevancy vs. a 1D projection of the data, with a side panel showing the details of a selected network-security event]

Approach: Learning the Scoring Function
Key idea: each adjustment should generalize naturally to the model
– Adjustments should affect similar instances
– Generalizations should be based on the geometry underlying the data
Our approach:
– Construct the manifold underlying the data
– Learn/update f using the manifold's basis
[Figure: neighborhood graph over vertices v1, …, v15]

Approach: Constructing the Manifold
Represent the data set X as an undirected graph G = (V, A), with vertex vi representing instance xi
The adjacency matrix A is given by:
– Weighting each edge (vi, vj) by a radial basis function of the distance between xi and xj
– Connecting each instance to its k nearest neighbors
G is a discrete approximation of the continuous manifold (a code sketch of this construction follows)
[Figure: data colored by the initial scoring function, and the corresponding partially filled adjacency matrix with example edge weights 0.4, 0.9, and 0.8]
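A minimal sketch of this graph construction, not the authors' implementation: the function name build_knn_rbf_graph, the Euclidean distance, the Gaussian RBF bandwidth sigma, the neighborhood size k, and the max-symmetrization are all assumptions, since the slide only specifies k-nearest-neighbor connectivity with RBF edge weights.

```python
import numpy as np
from scipy.spatial.distance import cdist

def build_knn_rbf_graph(X, k=10, sigma=1.0):
    """Adjacency matrix A for the k-NN graph of X, with each edge (v_i, v_j)
    weighted by a radial basis function of the distance between x_i and x_j."""
    n = X.shape[0]
    dist = cdist(X, X)                       # pairwise Euclidean distances
    A = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dist[i])[1:k + 1]  # k nearest neighbors, excluding x_i itself
        A[i, nbrs] = np.exp(-dist[i, nbrs] ** 2 / (2.0 * sigma ** 2))
    return np.maximum(A, A.T)                # symmetrize so that G is undirected
```

Taking the elementwise maximum is one common way to make the k-NN graph undirected; a mutual-k-NN rule or averaging the two directed weights would also match the description on the slide.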
Approach: Learning the Function on the Manifold
Form the graph Laplacian of G (Chung, 1994): L = D − A, where D is the diagonal degree matrix with Dii = Σj Aij
Take the eigendecomposition L = Q Λ Q^T, where Λ = diag(λ1, …, λn) and Q = [q1 … qn]
– Q forms a complete orthonormal basis for functions on G
– λ1 = 0, and the first eigenvector q1 is constant
[Figure: eigenvectors q1, q2, q5, q10, q20, and q50 visualized on 3D meshes; meshes provided by Gabriel Peyré]

Approach: Learning the Function on the Manifold
The scoring function f : V → [0,1] is given by f = QW
Fit W by least squares with Laplacian regularization
– This is a special case of Belkin et al.'s (2006) Manifold Regularization
– The Laplacian penalty satisfies f^T L f = W^T Λ W, so the eigenvalues Λ increasingly regularize the higher-order components
– A slider bar controls the weight given to the user's adjustments

Complete Algorithm for Interactive Refinement
Given: the data X and the user's initial scoring function f0
Set f = f0
Construct the manifold underlying X, represented by G = (V, A)
Compute the graph Laplacian L of G
Compute the eigenvectors Q and eigenvalues Λ of L
Repeat:
– Display the scatterplot of X using the scores given by f
– (Optional) The user adjusts the score of a data instance xi
– (Optional) The user updates the adjustment weight ω via a slider bar
– If there were changes, update the scoring function as f = QW, where W is refit by least squares with Laplacian regularization (a code sketch of this update follows the next slide)

Scaling to Large Volumes of Data
A can be stored efficiently as a symmetric banded matrix; L is also a symmetric banded matrix
– Use sparse eigensolvers (e.g., Lanczos methods) for efficiency
The Nyström method (Baker, 1977) extends the eigenvectors to new vertices for inductive learning
– Learn on a sample of the data, with its own graph and Laplacian
– Extend the eigenvectors to new instances via the Nyström extension
– The score for a new instance x (represented by a new vertex v) is then f(v) = [q1(v), …, qn(v)] W, using the extended eigenvector values at v
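A minimal sketch of the learning and update step from the preceding slides, not the authors' code. The helper names (laplacian_eigenbasis, fit_scores), the closed-form weighted solve in the eigenbasis, and the way the slider weight enters (as a per-instance weight c_i on adjusted points) are assumptions; the slides only state least squares with Laplacian regularization and f = QW.

```python
import numpy as np

def laplacian_eigenbasis(A, m=None):
    """Graph Laplacian L = D - A of G = (V, A) and its eigendecomposition
    L = Q diag(lam) Q^T; optionally keep only the m smoothest eigenvectors."""
    L = np.diag(A.sum(axis=1)) - A
    lam, Q = np.linalg.eigh(L)        # ascending: lam[0] ~ 0, Q[:, 0] is constant
    if m is not None:
        lam, Q = lam[:m], Q[:, :m]
    return lam, Q

def fit_scores(Q, lam, y, c):
    """Weighted least squares with Laplacian regularization:
        minimize_W  sum_i c_i (q_i^T W - y_i)^2  +  W^T diag(lam) W,
    so the eigenvalues increasingly penalize the higher-order (less smooth) components.
    Closed form: W = (Q^T C Q + diag(lam))^{-1} Q^T C y."""
    W = np.linalg.solve(Q.T @ (c[:, None] * Q) + np.diag(lam), Q.T @ (c * y))
    return Q @ W, W                   # f = QW gives the scores on the vertices
```

With m much smaller than n, each user adjustment only requires an m-by-m solve, which helps keep the per-adjustment update interactive.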
Evaluation
Simulate a user by adjusting the current "most incorrect" instance to the correct score (a sketch of this simulated-user protocol follows the results slides below)
– Users are adept at identifying outliers, motivating our approach
– The initial scoring function f0 is a linear model fit to X using ridge regression
Compared against interactive learning using:
– SMO support vector regression with an RBF kernel
– Least squares regularized with a ridge parameter of 10^-8
Data sets:
  Name                      #Inst  #Dim  Source
  CPU                       209    6     UCI repository
  Heart Disease             303    13    UCI repository
  Pharynx                   195    10    Kalbfleisch & Prentice (1980)
  Pyrimidines               74     27    King et al. (1992)
  Sleep                     62     7     StatLib archive
  Wisconsin Breast Cancer   194    32    UCI repository

Evaluation: Adjusting the "most incorrect" instance
[Results figure]

Evaluation: Adjusting a random instance (100 trials)
[Results figure]
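A sketch of the simulated-user protocol from the evaluation, reusing fit_scores from the previous sketch. The function name simulate_user, the adjustment weight omega, the number of steps, and the mean-absolute-error tracking are illustrative assumptions; the slides only specify that each step moves the currently most incorrect instance to its correct score and relearns f.

```python
import numpy as np

def simulate_user(Q, lam, f0, F_true, omega=10.0, steps=20):
    """Simulated user: at each step, find the instance whose current score is
    furthest from the true (hidden) score F_true, set it to the correct value
    with adjustment weight omega, and relearn the scoring function."""
    y, c = f0.copy(), np.ones_like(f0)
    f, _ = fit_scores(Q, lam, y, c)
    errors = [np.abs(f - F_true).mean()]
    for _ in range(steps):
        i = int(np.argmax(np.abs(f - F_true)))   # the "most incorrect" instance
        y[i], c[i] = F_true[i], omega            # correct it and upweight the adjustment
        f, _ = fit_scores(Q, lam, y, c)
        errors.append(np.abs(f - F_true).mean())
    return f, errors
```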
Related Work: Manifold Learning
Belkin et al.'s (2006) Manifold Regularization
– We use a special case, regularizing only the solution's smoothness
Semi-supervised learning using Gaussian random fields (Zhu et al., 2003; Cai et al., 2006)
Zhou et al.'s (2004; 2005) "Distribution Regularization"
– Uses a regularized form of the graph Laplacian as the basis
– Learns a function defined over the graph's vertices
Spectral graph transduction (Joachims, 2003)

Conclusion and Future Work
We presented a method for interactive learning based on least squares with Laplacian regularization
Manifold-based interactive learning continuously improves with each correction
In practice, the technique shows interactive response times for hundreds of data instances
Future work:
– User adjustment of the similarity metric between data instances
– Incorporating passive observation of the user
– Handling drifting or recurring concepts

Thank You! Questions?
Eric Eaton, eeaton@atl.lmco.com
This work was supported by internal funding from Lockheed Martin and by the National Science Foundation under NSF ITR #0325329.

References
Baker, C. T. H. 1977. The Numerical Treatment of Integral Equations. Oxford: Clarendon Press.
Belkin, M.; Niyogi, P.; and Sindhwani, V. 2006. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research 7:2399-2434.
Cai, D.; He, X.; and Han, J. 2007. Spectral regression: A unified subspace learning framework for content-based image retrieval. In Proceedings of the 15th International Conference on Multimedia, 403-412. ACM Press.
Chung, F. R. K. 1994. Spectral Graph Theory. Number 92 in CBMS Regional Conference Series in Mathematics. Providence, RI: American Mathematical Society.
Cohn, D. A.; Ghahramani, Z.; and Jordan, M. I. 1996. Active learning with statistical models. Journal of Artificial Intelligence Research 4:129-145.
desJardins, M.; MacGlashan, J.; and Ferraioli, J. 2008. Interactive visual clustering for relational data. In Constrained Clustering: Advances in Algorithms, Theory, and Applications, 329-356. Chapman & Hall.
Dy, J. G., and Brodley, C. E. 2000. Visualization and interactive feature selection for unsupervised data. In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 360-364. ACM Press.
Fails, J. A., and Olsen, Jr., D. R. 2003. Interactive machine learning. In Proceedings of the Eighth International Conference on Intelligent User Interfaces, 39-45. Miami, FL: ACM Press.
Joachims, T. 2003. Transductive learning via spectral graph partitioning. In Proceedings of the Twentieth International Conference on Machine Learning, 290-297.
McCallum, A., and Nigam, K. 1998. Employing EM in pool-based active learning for text classification. In Proceedings of the Fifteenth International Conference on Machine Learning, 359-367. San Francisco, CA: Morgan Kaufmann.
Tong, S. 2001. Active Learning: Theory and Applications. Ph.D. Dissertation, Stanford University.
Ware, M.; Frank, E.; Holmes, G.; Hall, M.; and Witten, I. H. 2001. Interactive machine learning: Letting users build classifiers. International Journal of Human-Computer Studies 55(3):281-292.
Wills, G. J. 1998. An interactive view for hierarchical clustering. In Proceedings of the 1998 IEEE Symposium on Information Visualization (INFOVIS). Washington, DC: IEEE Computer Society.
Zhou, D.; Huang, J.; and Schölkopf, B. 2005. Learning from labeled and unlabeled data on a directed graph. In Proceedings of the 22nd International Conference on Machine Learning, 1036-1043. Bonn, Germany: ACM Press.