Student Visit Day
AI, Vision, Graphics
24 March 2003

(Primary) Faculty
Claire Cardie – natural language, learning
Rich Caruana – learning
Joe Halpern – reasoning under uncertainty, agents
Thorsten Joachims – learning, text
Lillian Lee – natural language
Bart Selman – reasoning, hard problems (SAT)
Dan Huttenlocher – recognition, tracking
Ramin Zabih – stereo, medical imaging
Steve Marschner – surface reflectance
Kavita Bala – rendering, image-based rendering
Don Greenberg – rendering, modeling

Faculty With Related Interests
Bob Constable – automated reasoning
Shimon Edelman – computational vision
Eric Friedman – game theory, agents
Johannes Gehrke – data mining, clustering
Carla Gomes – reasoning
John Hopcroft – clustering
Jon Kleinberg – clustering, agents
Hod Lipson – neural nets
Mats Rooth – computational linguistics
Eva Tardos – game theory, agents
Ken Torrance – reflectance
Golan Yona – learning applied to comp bio

Activities/Centers in the Areas
Weekly seminars
  – AI
  – NLP
  – Graphics
  – Information Science
  – Bio-informatics
Intelligent Information Systems Institute (IISI)
Cognitive Studies Program
Program of Computer Graphics

Research in AI
Clustering
  – Cardie, Caruana, Hopcroft, Kleinberg, Lee, Selman, Yona
Learning
  – Cardie, Caruana, Joachims, Lee, Yona
Natural Language Processing (NLP)
  – Cardie, Lee, Joachims
Multi-Agent Systems
  – Halpern, Selman
Reasoning Under Uncertainty
  – Halpern, Huttenlocher, Zabih
Knowledge Representation and Reasoning
  – Gomes, Selman

Clustering
Large linked networks
  – Study of evolving structure / communities in very large linked networks
  – CiteSeer project (network of approx. 2 million papers; 5 million citations; 400,000 scientists)
Agglomerative clustering
  – Stable clusters reflecting “communities”
    • E.g., stability over time
  – Networks such as CiteSeer citations exhibit stable clusters not found in random graphs

Is There a Unified Foundation for Clustering?
Why so many ways to formalize the clustering problem?
  – (Agglomerative, centroid-based, information-theoretic, spectral, probabilistic generative models, discrete optimization models, ...)
To get some insight: an impossibility result.
  – Informally, there is no clustering function that simultaneously (1) is scale-invariant, (2) achieves any partition as a possible output, and (3) is “consistent” under stretching and shrinking of distances.
Upshot: inherent trade-offs in the definition of the clustering problem.

Learning Ranking Functions
Classification Functions:
  – “A is good”, “B is bad”
  – Often not appropriate
Ranking Functions:
  – “A is better than B”
  – Information Retrieval
  – Protein Folding (Ron Elber)
Example: Improve Search Engine using Machine Learning
[Figure: search engine results page, 282,000 hits]

Support Vector Machine
Example: Text Classification
  – SVMs are state-of-the-art
  – Many features (words)
  – Few training examples
Statistical Learning Theory
  – Ability to learn doesn’t necessarily depend on number of features
Kernels
  – Make classifiers non-linear
  – Learning on gene sequences, parse trees, graphs, etc.
Efficient Training
  – SVM-light
  – 30,000,000 examples
  – 500,000 features (sparse)
[Figure: a news story (“AOL buys Time Warner. Officials announced today that they finalized the negotiations about the merger of America Online and Time Warner. The …”) shown as a vector of word counts: aardvark 0, …, announced 1, …, time 2, toad 0, toll 0, …, warner 2, zztop 0]

• Learning-to-learn, Inductive Transfer, Multitask Learning
• Machine Learning for Medical Decision Making
  • learning rankings
  • modeling standard practice
  • C-section prediction
• Extreme Ensemble Learning
  • new approach
  • based on student project
  • best algorithm on planet?
[Figure: model types feeding the ensemble: Support Vector Machines, Neural Nets, K-Nearest Neighbor, Decision Trees; 1000’s of models!]
[Figure label: Selected Ensemble]

Natural Language Processing
Goal: get computers to use human languages (e.g., English)
We’ve developed state-of-the-art approaches in many end-to-end NLP applications, including information extraction, multi-document summarization, factual question answering, opinion-based question answering, ...

Research Theme: Weakly-Supervised Learning
Automatic extraction of useful information from language samples using minimal knowledge resources.
Enhances efficient portability to different domains (languages, subfields, ...).
Are “apple” and “sun” similar?
Weakly-supervised learning: leverage large unlabeled datasets, given a small amount of labeled data (and human patience)

Mostly-Unsupervised Learning
Japanese, Chinese, Thai, ...: no spaces between words
  theyouthevent
Combining simple statistics from unsegmented Japanese newswire yields results rivaling grammar-based approaches. [Ando/Lee 2000, 2003]

Active Learning
Problem: partial parsing (finding grammatical units)
  I saw [her duck] [with a telescope].
Active learning: a machine learning technique selectively chooses examples to ask a human to label with the right answers
Active learning can outperform human labeling of the entire dataset [Pierce/Cardie 2003].

Transduction
Given:
  – Small training set
  – Large test set
How can we minimize the number of errors on this test set?
[Figure: text classification example (class: physics) comparing SVM with transductive SVM. Document-term table for documents D1 to D6 over the words nuclear, atom, pepper, basil, salt, and; D1 is labeled +, D6 is labeled -, and D2 to D5 are unlabeled (?)]

Multi-Agent Systems: Reasoning About Knowledge
Modeling statements like “I know that you know that I know that you know . . .
”
  – Basic ideas go back to philosophers in the 1950s
  – Applications arise in distributed systems and game theory
    • Common knowledge is necessary for coordination
  – Current application: reasoning about security
  – Getting good models of resource-bounded reasoning is key
  – Bringing in probabilistic reasoning is also key

Games and Mechanisms in Multi-Agent Systems
Networks of Strategic Agents
  – Take into account how the link structure of a network affects interactions
    • Can be physical, social, financial networks
  – Spread of influence; spread of viruses
  – One research thrust: trading in illiquid markets
    • Effect of broker-client relationships
Trading and Combinatorial Auctions
  – Using combinatorial search algorithms to “solve” auctions
    • Yannis Vetsikas (Ph.D. student) won first place in the annual Trading Agent Competition

Causality and Explanation
What does it mean that
  – Event A causes event B?
  – Event A is an explanation of B?
Precise, useful definitions are notoriously difficult
  – Philosophers have been working on it for millennia
  – Halpern and Pearl have given a formal definition, using Pearl’s model of structural equations.
Current work: extending definitions to notions of responsibility and blame
  – Suppose a firing squad consisting of 10 excellent marksmen shoots a prisoner
  – Only one has a live bullet, but none of them knows which one has it
  – The marksman with the live bullet is responsible for the death; each marksman has degree of blame 1/10.

Reasoning About Uncertainty
Modeling likelihood using probability and nonprobabilistic approaches
  – Plausibility measures generalize probability and all other standard models of uncertainty
  – Using formal decision theory to model decision making in systems.
Markov models with hidden state
  – HMMs
  – NDFA where each state has a distribution over labels rather than a single label
    • Efficient new algorithms for large state spaces
    • Similar techniques used in vision (MRFs)

Large Scale Reasoning Engines
Our inference methods can handle problems with over one million variables and five million constraints.
  – Previously only hundreds of constraints
Novel applications:
  1) Planning
  2) Automated design and verification
  3) NASA space mission control

Research in Vision and Graphics
Vision
  – Stereo and motion
    • Zabih
  – Markov Random Fields (MRFs)
    • Huttenlocher and Zabih
  – Pictorial Structures
    • Huttenlocher
  – Medical Imaging
    • Zabih
Graphics
  – Rendering
    • Bala, Greenberg
  – Reflectance models and measurements
    • Marschner, Torrance

Vision Research
Problem areas
  – Recognition, determining scene structure, surface properties, motion analysis
We have a strong algorithmic focus
  – Discrete mathematics problem formulation
  – Theoretical and practical advances
Applications
  – Techniques we developed have played an important role at Xerox and Microsoft, and also resulted in successful startups
  – Medical imaging: Zabih now holds a joint appointment with the Radiology department in NYC

Some Research Highlights
Hausdorff distance matching methods for tracking and recognition
Pictorial structures for recognizing articulated (multipart) objects
  – Graphical models, Markov random fields, dynamic programming
Improved methods for content-based image retrieval
Fast and accurate approximation methods for low-level vision: stereo, motion
  – MRFs, graph cuts

Pictorial Structures
Model part appearance plus non-local spatial constraints
  – Image patches describing color, texture, etc.
  – 2D spatial relations between pairs of patches
    • Limitations of solid models
Simultaneously use appearance + geometry
MRF models
  – Random variables on part locations, pairwise constraints

Example: Recognizing People

MRFs: Graph Labeling Problems
Given
  – Graph structure
  – Label set
  – Data cost function
Find a labeling such that
  – Data cost is small
  – Neighbors have similar labels
[Figure: example graph with nodes numbered 1 to 5 and its labeling]

Application: Low-level vision
Stereo
  – Nodes are pixels, labels are disparities (depths)
  – Adjacent pixels tend to be at similar depths
  – Data cost comes from the intensity difference
    • If a pixel p in the left image shows the same point in the scene as p’ in the right, p and p’ should have similar intensities

State of the art in low-level vision
Previous algorithms do poorly, especially at object boundaries
  – Important for almost any application
[Figure panels: Ideal results; Normalized correlation results; Our results]

Program of Computer Graphics (PCG)
Founded in 1974
Pioneering research in
  – physically-based rendering
  – perception in graphics
Four faculty and three research associates
Alumni

Lab Goals

Radiosity

High-Quality Interactive Rendering
Scalable algorithms and systems for complex scenes
  – Dynamic objects, lighting (global illumination, many lights), high-quality shadows
  – Image-based rendering

Rendering Complexity: GI
Analytical edges: perceptually important
Sparse samples: interactive
[Figure panels: Edges; Edges and samples; Output image]
1M+ polygons, 20-60x speedup
8-14 fps (interactive)

High-Quality Shadows
[Figure panels: Regular Shadow Maps; Adaptive]

Lighting Complexity: Many lights
  • 750k polygons, 100 lights
  • 30x speedup
[Figure labels: With Masking; Image; expensive; cheap]

Material Properties
Essential to rendering
Becoming the limiting factor for realism
[Image label: University of Calgary]

Real Materials
[Figure labels: smooth microstructure; complex microstructure; metals; nonmetals]

Image-based Measurement
human skin
translucency

Some Recent Highlights
Continue to be over-represented in top conferences
  – E.g., SIGGRAPH, NAACL
  – Many student papers
Widely used software
  – E.g., 7,000 SVM-light downloads
Selman named AAAI Fellow
First place in Trading Agent Competition
Many summer internships
  – NEC, Microsoft Cambridge, Microsoft Research, PARC, Almaden, Watson, Hopkins Language Technology Workshop, Google, ISI
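
Supplementary sketch: graph labeling on a chain
The graph labeling slide asks for a labeling in which the data cost is small and neighbors have similar labels. The Python sketch below is only an illustration of that objective, not the method used in the research described here: it assumes the graph is a simple chain (for example, one image scanline), uses an absolute-difference penalty between neighboring labels, and solves the problem exactly by dynamic programming rather than by the graph-cut methods named on the slides. The function name label_chain, the smooth_weight parameter, and the toy cost table are all hypothetical.

```python
def label_chain(data_cost, smooth_weight=1.0):
    """Label a chain of nodes, minimising data cost plus neighbour disagreement.

    data_cost[i][l] is the (hypothetical) cost of giving node i label l.
    Neighbouring nodes pay smooth_weight * |l - m| for labels l and m.
    Returns (labels, total_cost).
    """
    num_labels = len(data_cost[0])
    # best[l]: cheapest cost of labelling all nodes so far, ending in label l.
    best = list(data_cost[0])
    back_pointers = []
    for costs in data_cost[1:]:
        new_best, pointers = [], []
        for l in range(num_labels):
            # Cheapest label m for the previous node, given label l here.
            m = min(range(num_labels),
                    key=lambda m: best[m] + smooth_weight * abs(l - m))
            new_best.append(best[m] + smooth_weight * abs(l - m) + costs[l])
            pointers.append(m)
        best = new_best
        back_pointers.append(pointers)
    # Recover the optimal labelling by walking the back pointers.
    label = min(range(num_labels), key=lambda l: best[l])
    total = best[label]
    labels = [label]
    for pointers in reversed(back_pointers):
        label = pointers[label]
        labels.append(label)
    labels.reverse()
    return labels, total


# Toy example: node 2 weakly prefers label 1, its neighbours prefer label 0.
noisy = [[0, 5], [0, 5], [3, 2], [0, 5]]
print(label_chain(noisy, smooth_weight=0.0))  # data cost only
print(label_chain(noisy, smooth_weight=1.0))  # data cost plus smoothness
```

With smooth_weight set to 0 each node simply takes its cheapest label, so node 2 is labeled differently from its neighbors. With smooth_weight set to 1 the disagreement penalty outweighs node 2's weak preference, and the whole chain receives the same label.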