Student Visit Day
AI, Vision, Graphics
24 March 2003
(Primary) Faculty
Claire Cardie – natural language, learning
Rich Caruana – learning
Joe Halpern – reasoning under uncertainty, agents
Thorsten Joachims – learning, text
Lillian Lee – natural language
Bart Selman – reasoning, hard problems (SAT)
Dan Huttenlocher – recognition, tracking
Ramin Zabih – stereo, medical imaging
Steve Marschner – surface reflectance
Kavita Bala – rendering, image-based rendering
Don Greenberg – rendering, modeling
2
Faculty With Related Interests
Bob Constable – automated reasoning
Shimon Edelman – computational vision
Eric Friedman – game theory, agents
Johannes Gehrke – data mining, clustering
Carla Gomes – reasoning
John Hopcroft – clustering
Jon Kleinberg – clustering, agents
Hod Lipson – neural nets
Mats Rooth – computational linguistics
Eva Tardos – game theory, agents
Ken Torrance – reflectance
Golan Yona – learning applied to comp bio
3
Activities/Centers in the Areas
 Weekly seminars
– AI
– NLP
– Graphics
– Information Science
– Bio-informatics
 Intelligent Information Systems Institute (IISI)
 Cognitive Studies Program
 Program of Computer Graphics
4
Research in AI
 Clustering
– Cardie, Caruana, Hopcroft, Kleinberg, Lee, Selman, Yona
 Learning
– Cardie, Caruana, Joachims, Lee, Yona
 Natural Language Processing (NLP)
– Cardie, Lee, Joachims
 Multi-Agent Systems
– Halpern, Selman
 Reasoning Under Uncertainty
– Halpern, Huttenlocher, Zabih
 Knowledge Representation and Reasoning
– Gomes, Selman
5
Clustering
 Large linked networks
– Study of evolving structure / communities in very
large linked networks
– CiteSeer project (network of approx. 2 million
papers; 5 million citations; 400,000 scientists)
 Agglomerative clustering
– Stable clusters reflecting “communities”
• E.g., stability over time
– Networks such as CiteSeer citations exhibit stable
clusters not found in random graphs
6
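A minimal sketch of agglomerative clustering on a small co-citation-style distance matrix, using SciPy's hierarchical clustering. The distances and the cut threshold are invented for illustration; this is not the CiteSeer analysis itself.

```python
# Toy agglomerative clustering sketch (illustrative data, not the CiteSeer study).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Hypothetical pairwise distances between 6 papers (smaller = more related).
D = np.array([
    [0.0, 0.1, 0.2, 0.9, 0.9, 0.8],
    [0.1, 0.0, 0.1, 0.8, 0.9, 0.9],
    [0.2, 0.1, 0.0, 0.9, 0.8, 0.9],
    [0.9, 0.8, 0.9, 0.0, 0.2, 0.1],
    [0.9, 0.9, 0.8, 0.2, 0.0, 0.1],
    [0.8, 0.9, 0.9, 0.1, 0.1, 0.0],
])

# Agglomerative (average-linkage) clustering: repeatedly merge the closest clusters.
Z = linkage(squareform(D), method="average")

# Cut the dendrogram at a distance threshold to obtain flat "communities".
labels = fcluster(Z, t=0.5, criterion="distance")
print(labels)   # e.g. [1 1 1 2 2 2]: two stable communities
```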
7
Is There a Unified Foundation for Clustering?
 Why so many ways to formalize the clustering problem?
– (Agglomerative, centroid-based, information-theoretic, spectral,
probabilistic generative models, discrete optimization models ... )
 To get some insight: an impossibility result.
– Informally, no clustering function exists that simultaneously (1) is scale-invariant, (2) can achieve any partition as a possible output, and (3) is “consistent” under stretching and shrinking of distances.
 Upshot: inherent trade-offs in the definition of the clustering
problem.
8
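For reference, a sketch of the three properties in the impossibility result (Kleinberg's formulation), stated for a clustering function $f$ that maps a distance function $d$ on a point set $S$ to a partition of $S$:

```latex
\begin{itemize}
  \item \textbf{Scale-invariance:} $f(\alpha \cdot d) = f(d)$ for every distance function $d$ and every $\alpha > 0$.
  \item \textbf{Richness:} every partition $\Gamma$ of $S$ equals $f(d)$ for some distance function $d$.
  \item \textbf{Consistency:} if $d'$ shrinks distances within the clusters of $f(d)$ and stretches distances
        between clusters, then $f(d') = f(d)$.
\end{itemize}
\textbf{Theorem (informal).} For $|S| \ge 2$, no clustering function satisfies all three properties.
```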
Learning Ranking Functions
 Classification Functions:
– “A is good”, “B is bad”
– Often not appropriate
 Ranking Functions:
– “A is better than B”
– Information Retrieval
– Protein Folding (Ron Elber)
 Example: Improve Search Engine using Machine Learning
[Figure: search-engine results page showing 282,000 hits]
9
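A minimal sketch of the pairwise reduction behind learning ranking functions: each preference “A is better than B” becomes a training example on the feature difference, and a linear SVM learns a weight vector whose scores induce the ranking. This uses scikit-learn's LinearSVC as a stand-in for a ranking SVM; the feature vectors and preferences are made up.

```python
# Pairwise ranking sketch: "A is better than B" -> classify (x_A - x_B) as +1.
import numpy as np
from sklearn.svm import LinearSVC

# Hypothetical 3-feature representations of four documents for one query.
X = np.array([[0.9, 0.1, 0.3],   # doc 0
              [0.4, 0.7, 0.2],   # doc 1
              [0.2, 0.2, 0.9],   # doc 2
              [0.1, 0.5, 0.1]])  # doc 3

# Hypothetical preferences: (better, worse).
prefs = [(0, 1), (0, 3), (1, 3), (2, 3)]

# Build difference vectors in both directions so the two classes are balanced.
diffs  = np.array([X[a] - X[b] for a, b in prefs] + [X[b] - X[a] for a, b in prefs])
labels = np.array([1] * len(prefs) + [-1] * len(prefs))

ranker = LinearSVC(C=1.0).fit(diffs, labels)

# Rank documents by the learned scoring function w . x (higher = better).
scores = ranker.decision_function(X)
print(np.argsort(-scores))   # documents ordered from best to worst
```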
Support Vector Machine
 Example: Text Classification
– SVMs are state-of-the-art
– Many features (words)
– Few training examples
 Statistical Learning Theory
– Ability to learn doesn’t necessarily depend on number of features
 Kernels
– Make classifiers non-linear
– Learning on gene sequences, parse trees, graphs, etc.
 Efficient Training
– SVM-light
– 30,000,000 examples
– 500,000 features (sparse)
[Figure: bag-of-words representation – the news snippet “Officials announced today that they finalized the negotiations about the merger of America Online and Time Warner. The …” becomes a sparse word-count vector, e.g. aardvark 0, …, announced 1, …, time 2, toad 0, toll 0, …, warner 2, …, zztop 0]
10
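A minimal text-classification sketch in the spirit of the slide, using scikit-learn's linear SVM rather than SVM-light; the tiny corpus and labels are invented for illustration.

```python
# Bag-of-words + linear SVM text classification sketch (toy data, not SVM-light).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

train_docs = [
    "Officials announced the merger of America Online and Time Warner.",
    "The quarterback threw three touchdowns in the final quarter.",
    "Shareholders approved the acquisition after lengthy negotiations.",
    "The team clinched the championship with a last-minute goal.",
]
train_labels = ["business", "sports", "business", "sports"]

# Each document becomes a sparse vector of weighted word counts, exactly the
# kind of high-dimensional, sparse representation linear SVMs handle well.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(train_docs, train_labels)

print(model.predict(["Time Warner stock rose after the announcement."]))
```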
• Learning-to-learn, Inductive Transfer, Multitask Learning
• Machine Learning for Medical Decision Making
  • learning rankings
  • modeling standard practice
  • C-section prediction
• Extreme Ensemble Learning
  • new approach
  • based on student project
  • best algorithm on planet?
[Figure: a library of 1000’s of models (Support Vector Machines, Neural Nets, K-Nearest Neighbor, Decision Trees) feeding into a Selected Ensemble]
11
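A rough sketch of the ensemble-selection idea behind the slide: greedily pick models from a large library so that their averaged predictions maximize held-out accuracy. This is a generic reconstruction with a tiny model library and synthetic data, not the actual system.

```python
# Greedy ensemble selection sketch: pick models (with replacement) from a
# library so the averaged prediction maximizes held-out accuracy.  Toy setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.5, random_state=0)

# "Library" of models (a real system would have thousands).
library = [LogisticRegression(max_iter=1000),
           KNeighborsClassifier(5),
           DecisionTreeClassifier(max_depth=4, random_state=0)]
val_probs = [m.fit(X_tr, y_tr).predict_proba(X_val)[:, 1] for m in library]

ensemble, best = [], 0.0
for _ in range(10):                              # a few greedy rounds
    scores = []
    for p in val_probs:                          # try adding each library model
        avg = np.mean(ensemble + [p], axis=0)
        scores.append(np.mean((avg > 0.5) == y_val))
    i = int(np.argmax(scores))
    if scores[i] < best:
        break                                    # no model improves the ensemble
    best = scores[i]
    ensemble.append(val_probs[i])                # add it (repeats are allowed)

print("selected", len(ensemble), "models; validation accuracy", best)
```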
Natural Language Processing
Goal: get computers to use human languages (e.g.,
English)
We’ve developed state-of-the-art approaches in many end-to-end
NLP applications including information extraction, multi-document
summarization, factual question answering, opinion-based question
answering, ...
12
Research Theme: Weakly-Supervised
Learning
Automatic extraction of useful information from language samples
using minimal knowledge resources.
Enhances efficient portability to different domains (languages,
subfields, ...).
Are “apple” and “sun” similar?
Weakly-supervised learning: Leverage large unlabeled datasets,
given a small amount of labeled data (and human patience)
13
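One hedged illustration of how a question like “are apple and sun similar?” can be attacked with unlabeled text alone: represent each word by counts of the words that co-occur with it and compare the vectors. This is a generic distributional-similarity sketch on an invented toy corpus, not the specific method behind the slide.

```python
# Distributional similarity sketch: words are similar if they occur in similar contexts.
from collections import Counter
from math import sqrt

corpus = ("the red apple fell from the tree . "
          "the green apple fell from the tree . "
          "the bright sun rose over the hill . "
          "the warm sun rose over the sea .").split()

def context_vector(word, window=2):
    """Counts of words appearing within `window` positions of `word`."""
    ctx = Counter()
    for i, w in enumerate(corpus):
        if w == word:
            for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
                if j != i:
                    ctx[corpus[j]] += 1
    return ctx

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    return dot / (sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values())))

print(cosine(context_vector("apple"), context_vector("sun")))   # modest overlap only
print(cosine(context_vector("apple"), context_vector("apple"))) # 1.0
```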
Mostly-Unsupervised Learning
Japanese, Chinese, Thai, ...: no spaces between words
theyouthevent
Combining simple statistics from unsegmented Japanese newswire
yields results rivaling grammar-based approaches.
[Ando/Lee 2000, 2003]
14
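A toy sketch of the underlying intuition only: character n-grams that straddle a true word boundary tend to be rarer than those that do not. This is a much-simplified stand-in, not the Ando/Lee algorithm, and the unsegmented “corpus” is invented.

```python
# Toy word-segmentation sketch: propose a boundary where the character bigram
# straddling the position is rarer than its non-straddling neighbors.
from collections import Counter

# Unsegmented "corpus" (spaces removed from English for illustration).
corpus = "thecatsatonthematthecatatethefishthedogsatonthemat"
bigrams = Counter(corpus[i:i + 2] for i in range(len(corpus) - 1))

def segment(text):
    out = [text[0]]
    for i in range(1, len(text) - 1):
        straddle = bigrams[text[i - 1:i + 1]]          # bigram crossing the candidate boundary
        left     = bigrams[text[i - 2:i]] if i >= 2 else 0
        right    = bigrams[text[i:i + 2]]
        if straddle < left and straddle < right:       # rare straddling bigram -> boundary
            out.append(" ")
        out.append(text[i])
    out.append(text[-1])
    return "".join(out)

print(segment("thedogatethefish"))   # boundaries proposed from corpus statistics alone
```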
Active Learning
Problem: partial parsing — find grammatical units
I saw [her duck] [with a telescope].
Active learning: a machine-learning technique that selectively chooses examples to ask a human to label with the right answers
Active learning can outperform human labeling of the entire dataset [Pierce/Cardie 2003].
15
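A generic pool-based active-learning loop with uncertainty sampling (not the Pierce/Cardie system): repeatedly train on the labeled pool, then ask the “annotator” for the label of the example the model is least sure about. The data here are synthetic.

```python
# Pool-based active learning sketch with uncertainty sampling (toy data).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
labeled = list(range(10))                      # start with 10 labeled examples
pool = [i for i in range(len(X)) if i not in labeled]

for _ in range(20):                            # 20 queries to the "annotator"
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    probs = clf.predict_proba(X[pool])
    # Least-confident example: predicted class probability closest to 0.5.
    query = pool[int(np.argmin(np.abs(probs[:, 1] - 0.5)))]
    labeled.append(query)                      # the oracle (here: y) provides the label
    pool.remove(query)

print("accuracy after active learning:", clf.score(X, y))
```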
Transduction
 Given:
– Little training set
– Large test set
 How can we minimize the
number of errors on this test
set?
 Text Classification
[Figure: word–document table (words such as “nuclear”, “atom”, “physics”, “pepper”, “basil”, “salt”, “and”; documents D1–D6). D1 is labeled +, D6 is labeled –, and D2–D5 are unlabeled (?). The transductive SVM places its decision boundary taking the unlabeled test documents into account, unlike the standard SVM.]
16
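A simplified stand-in for transduction (not Joachims' transductive SVM): a self-training loop that fits an SVM on the small labeled set, then gradually commits to its most confident predictions on the unlabeled test documents and retrains. Data are synthetic.

```python
# Self-training sketch as a simple stand-in for transduction (toy data).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=300, n_features=20, random_state=1)
labeled   = np.arange(20)            # small training set
unlabeled = np.arange(20, 300)       # large test set we want to do well on

pseudo_X, pseudo_y = list(X[labeled]), list(y[labeled])
for _ in range(5):                                   # a few self-training rounds
    clf = LinearSVC(C=1.0).fit(pseudo_X, pseudo_y)
    margins = clf.decision_function(X[unlabeled])
    confident = np.argsort(-np.abs(margins))[:20]    # 20 most confident test points
    for i in confident:
        pseudo_X.append(X[unlabeled[i]])
        pseudo_y.append(int(margins[i] > 0))         # commit to the predicted label
    unlabeled = np.delete(unlabeled, confident)

print("test accuracy:", clf.score(X[20:], y[20:]))
```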
Multi-Agent Systems:
Reasoning About Knowledge
 Modeling statements like “I know that you know
that I know that you know . . . ”
– Basic ideas go back to philosophers in 1950s
– Applications arise in distributed systems and game
theory
• Common knowledge necessary for coordination
– Current application: reasoning about security
– Getting good models of resource-bounded reasoning
is key
– Bringing in probabilistic reasoning is also key
17
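A toy sketch of the kind of model used for such statements: a Kripke structure with one accessibility relation per agent, where “agent i knows φ” means φ holds in every world i considers possible. The particular worlds, facts, and relations below are invented.

```python
# Toy Kripke-structure sketch for "i knows that j knows p" (illustrative only).
WORLDS = {"w1", "w2", "w3"}
FACTS  = {"w1": {"p"}, "w2": {"p"}, "w3": set()}   # which atomic facts hold where

# For each agent, which worlds it considers possible from each world.
ACCESS = {
    "alice": {"w1": {"w1", "w2"}, "w2": {"w1", "w2"}, "w3": {"w3"}},
    "bob":   {"w1": {"w1"},       "w2": {"w2", "w3"}, "w3": {"w2", "w3"}},
}

def holds(world, phi):
    """phi is an atom like 'p' or a tuple ('K', agent, phi)."""
    if isinstance(phi, str):
        return phi in FACTS[world]
    _, agent, sub = phi
    return all(holds(v, sub) for v in ACCESS[agent][world])

# In w1, Alice knows p (p holds in both worlds she considers possible).
print(holds("w1", ("K", "alice", "p")))                 # True
# But she does not know that Bob knows p: in w2, Bob considers w3 possible, where p fails.
print(holds("w1", ("K", "alice", ("K", "bob", "p"))))   # False
```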
Games and Mechanisms in
Multi-Agent Systems
 Networks of Strategic Agents
– Take into account how link structure of a network affects
interactions
• Can be physical, social, financial networks
– Spread of influence; spread of viruses
– One research thrust: trading in illiquid markets
• Effect of broker-client relationships
 Trading and Combinatorial Auctions
– Using combinatorial search algorithms to “solve” auctions
• Yannis Vetsikas (Ph.D. student) won first place in
annual Trading Agent Competition
18
Causality and Explanation
 What does it mean that
– Event A causes event B? Event A is an explanation of B?
 Precise, useful definitions are notoriously difficult
– Philosophers have been working on this for millennia
– Halpern and Pearl have given a formal definition, using Pearl’s
model of structural equations.
 Current work: Extending definitions to notions of responsibility and
blame
– Suppose a firing squad consisting of 10 excellent marksmen
shoots a prisoner
– Only one has a live bullet, but none of them know which one has
it
– The marksman with the live bullet is responsible for the death; each marksman has degree of blame 1/10.
19
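A sketch of the arithmetic behind the firing-squad example, in the Chockler–Halpern style: a marksman's degree of blame is his expected degree of responsibility over the epistemically possible scenarios of who holds the live bullet, assumed uniform here.

```latex
% Each of the 10 marksmen considers it equally likely that any one of them has
% the live bullet.  His responsibility is 1 in the scenario where he has it and
% 0 otherwise, so the degree of blame of each marksman $i$ is
\[
  \mathrm{db}(i)
  \;=\; \sum_{k=1}^{10} \Pr(\text{marksman } k \text{ has the live bullet}) \cdot
        \mathrm{dr}(i \mid k)
  \;=\; \tfrac{1}{10}\cdot 1 \;+\; \tfrac{9}{10}\cdot 0
  \;=\; \tfrac{1}{10}.
\]
```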
Reasoning About Uncertainty
 Modeling likelihood using probabilistic and non-probabilistic approaches
– Plausibility measures, which generalize probability and all other standard models of uncertainty
– Using formal decision theory to model decision
making in systems.
 Markov models with hidden state – HMM’s
– An NDFA where each state has a distribution over labels rather than a single label
• Efficient new algorithms for large state spaces
• Similar techniques used in vision (MRF’s)
20
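A minimal forward-algorithm sketch for an HMM (the standard recursion, on a made-up two-state model), illustrating the kind of inference over hidden state the slide refers to; it is not the group's large-state-space algorithm.

```python
# Forward algorithm for a tiny 2-state HMM (toy parameters).
import numpy as np

pi = np.array([0.6, 0.4])                 # initial state distribution
A  = np.array([[0.7, 0.3],                # transition probabilities A[s, s']
               [0.2, 0.8]])
B  = np.array([[0.9, 0.1],                # emission probabilities B[s, symbol]
               [0.3, 0.7]])

def forward(obs):
    """Return P(observations) by summing over all hidden state sequences."""
    alpha = pi * B[:, obs[0]]             # alpha[s] = P(obs[0], state_0 = s)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]     # propagate one step, weight by emission
    return alpha.sum()

print(forward([0, 0, 1, 0]))              # likelihood of the observation sequence
```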
Large Scale Reasoning Engines
Our inference methods can handle problems with over
one million variables and five million constraints.
– Previously only hundreds of constraints
Novel applications:
1) Planning
2) Automated design and verification
3) NASA Space mission control
21
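To make the scale concrete: a complete SAT procedure fits in a few lines, as in the toy DPLL sketch below on a hand-made formula, but handling millions of variables and constraints requires the kind of engineering and heuristics the slide alludes to. The formula is invented.

```python
# Toy DPLL SAT solver sketch (nowhere near modern solver scale).
# A formula is a list of clauses; a clause is a list of literals (+/- variable).

def dpll(clauses, assignment=()):
    # Simplify clauses under the current partial assignment.
    simplified = []
    for clause in clauses:
        if any(lit in assignment for lit in clause):
            continue                                  # clause already satisfied
        reduced = [lit for lit in clause if -lit not in assignment]
        if not reduced:
            return None                               # clause falsified: backtrack
        simplified.append(reduced)
    if not simplified:
        return assignment                             # all clauses satisfied
    var = abs(simplified[0][0])                       # branch on an unassigned variable
    for lit in (var, -var):
        result = dpll(simplified, assignment + (lit,))
        if result is not None:
            return result
    return None

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(dpll([[1, 2], [-1, 3], [-2, -3]]))   # e.g. (1, 3, -2): x1=True, x3=True, x2=False
```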
Research in Vision and Graphics
Vision
– Stereo and motion
• Zabih
– Markov Random Fields (MRF’s)
• Huttenlocher and Zabih
– Pictorial Structures
• Huttenlocher
– Medical Imaging
• Zabih
Graphics
– Rendering
• Bala, Greenberg
– Reflectance models and measurements
• Marschner, Torrance
22
Vision Research
 Problem areas
– Recognition, determining scene structure, surface
properties, motion analysis
 We have a strong algorithmic focus
– Discrete mathematics problem formulation
– Theoretical and practical advances
 Applications
– Techniques we developed have played an important role at
Xerox and Microsoft, and also resulted in successful
startups
– Medical imaging – Zabih now holds a joint appointment with the Radiology department in NYC
23
Some Research Highlights
 Hausdorff distance matching methods for tracking
and recognition
 Pictorial structures for recognizing articulated (multi-part) objects
– Graphical models, Markov random fields, dynamic
programming
 Improved methods for content-based image retrieval
 Fast and accurate approximation methods for low-level vision: stereo, motion
– MRF’s, graph cuts
24
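For reference, the directed and undirected Hausdorff distances used in the matching methods mentioned above, for point sets $A$ and $B$:

```latex
\[
  h(A,B) \;=\; \max_{a \in A} \, \min_{b \in B} \, \lVert a - b \rVert ,
  \qquad
  H(A,B) \;=\; \max\bigl( h(A,B),\; h(B,A) \bigr).
\]
```

In practice, robust variants (for example replacing the outer max with a quantile) are often used so that a fraction of outlier points does not dominate the distance.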
Pictorial Structures
 Model part appearance plus non-local spatial
constraints
– Image patches describing color, texture, etc.
– 2D spatial relations between pairs of patches
• Limitations of solid models
 Simultaneously use appearance+geometry
 MRF models – random variables on part locations,
pairwise constraints
25
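A sketch of the standard pictorial-structures objective this slide describes: find part locations $L = (l_1, \dots, l_n)$ minimizing appearance-mismatch terms plus pairwise spatial-deformation terms over the connected pairs of parts.

```latex
\[
  L^{*} \;=\; \arg\min_{l_1,\dots,l_n}
      \Bigl( \sum_{i=1}^{n} m_i(l_i)
      \;+\; \sum_{(i,j) \in E} d_{ij}(l_i, l_j) \Bigr)
\]
% m_i(l_i): cost of placing part i at location l_i (appearance / image-patch match)
% d_{ij}(l_i, l_j): cost of the relative placement of connected parts i and j
```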
Example: Recognizing People
26
MRF’s: Graph Labeling Problems
Given: a graph structure, a label set, and a data cost function
Find: a labeling such that
– Data cost is small
– Neighbors have similar labels
[Figure: example graph whose nodes are assigned labels from the set {1, …, 5}]
27
Application: Low-level vision
Stereo
– Nodes are pixels, labels are disparities (depths)
– Adjacent pixels tend to be at similar depths
– Data cost comes from the intensity difference
• If a pixel p in the left image shows the same point in the scene as p’ in the right image, the intensities of p and p’ should be similar
28
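A sketch of the energy behind the last two slides: a labeling $f$ is scored by per-node data costs plus smoothness costs on neighboring nodes; for stereo, the labels are disparities and the data cost compares the intensities of putatively corresponding pixels.

```latex
\[
  E(f) \;=\;
  \underbrace{\sum_{p} D_p(f_p)}_{\text{data cost is small}}
  \;+\;
  \underbrace{\sum_{(p,q) \in \mathcal{N}} V\!\left(f_p, f_q\right)}_{\text{neighbors have similar labels}},
  \qquad
  \text{e.g. for stereo } D_p(d) = \bigl| I_L(p) - I_R(p - d) \bigr| .
\]
```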
State of the art in low-level vision
 Previous algorithms do poorly, especially at
object boundaries
– Important for almost any application
[Figure: comparison of ideal results, normalized correlation results, and our results]
29
Program of Computer Graphics (PCG)
 Founded in 1974
 Pioneering research in
– physically-based rendering
– perception in graphics
 Four faculty and three
research assoc.
 Alumni
30
Lab Goals
31
Radiosity
32
High-Quality Interactive Rendering
 Scalable algorithms and systems for complex scenes
– Dynamic objects, lighting (global illumination, many lights), high-quality shadows
– Image-based rendering
33
Rendering Complexity: GI
 Analytical edges: perceptually important
 Sparse samples: interactive
[Figure: detected edges, edges plus sparse samples, and the output image]
 1M+ polygons, 20-60x speedup
 8-14 fps (interactive)
34
High-Quality Shadows
[Figure: regular shadow maps vs. adaptive shadow maps]
35
Lighting Complexity: Many lights
• 750k polygons, 100 lights
• 30x speedup
[Figure: image rendered with masking, with expensive vs. cheap lighting regions indicated]
36
Material Properties
 Essential to rendering
 Becoming the limiting factor for realism
37
Real Materials
[Figure: real material samples arranged by smooth vs. complex microstructure and metals vs. nonmetals]
38
Image-based Measurement
[Figure: image-based measurement of human skin translucency]
39
Some Recent Highlights
 Continue to be over-represented in top conferences
– E.g., SIGGRAPH, NAACL
– Many student papers
 Widely used software
– E.g., 7,000 SVM-light downloads
 Selman named AAAI fellow
 First place in Trading Agent Competition
 Many summer internships
– NEC, Microsoft Cambridge, Microsoft Research, PARC,
Almaden, Watson, Hopkins Language Technology Workshop,
Google, ISI
40