Names and Faces Presented By Krishnan Ramnath Robotics Institute, CMU, Pittsburgh

Agenda

Motivation and Goal

Dataset

Algorithm Outline

Results
Goal
President George W. Bush makes a statement in the Rose Garden while Secretary of
Defense Donald Rumsfeld looks on, July 23, 2003. Rumsfeld said the United States
would release graphic photographs of the dead sons of Saddam Hussein to prove they
were killed by American troops. Photo by Larry Downing/Reuters
Motivation

The web has made large multimedia collections
readily available
- Organise contents

Commercial face recognition systems do not work
in practice.
- Datasets were captured in labs and lack the
complexity of real world faces
Airport Face Scanner Failed (Wired News)
Facial recognition technology tested at the Palm Beach International Airport
had a dismal failure rate, according to preliminary results from a pilot
program at the facility.
The Palm Beach airport tried Visionics' FaceIt system, which snaps
photographs of passersby using a security camera and breaks down their
facial features into a numeric code that is matched against the photograph
database.
The system failed to correctly identify airport employees 53 percent of
the time, according to test data that was obtained by the American Civil
Liberties Union under Florida's open records law.
"The preliminary results at the Palm Beach International Airport confirm that
the use of facial recognition technology is simply ineffective and of no
value," said Randall Marshall, legal director of the state ACLU chapter.
Methodology
1) Collect a new dataset from news photographs
- Individual faces are not labeled
2) Develop a technique for automatically associating names
with faces
Applications:

Organise news pictures by
the individuals present

Browse by individual
Dataset

500,000 news pictures with associated captions
- Pictures captured “in the wild”
- Mainly people: politicians, actors, sports players
- Multiple names for one face
- Heavy-tailed distribution
- Multiple faces for one name
Dataset
Actress Winona Ryder (news) reacts to
remarks by prosecutor Ann Rundle during
the sentencing hearing in her felony
shoplifting case Friday, Dec. 6, 2002 at the
Beverly Hills, Calif., courthouse. At right is
Ryder's attorney Mark Geragos. Ryder was
sentenced to three years of probation and
was ordered to perform 480 hours of
community service. (AP
Photo/Steve Grayson, POOL)
France's Amelie Mauresmo has pulled
out of the Australian Open (news web sites) because of a knee injury. The
French number one, ranked sixth in the
world, has been struggling with a
painful knee cap for two months and
said on December 18, 2002 she would
not be fit in time to resume playing in
January. Mauresmo is shown during the
Kremlin Cup, Oct. 4. (Grigory
Dukor/Reuters)
Dataset (cont.)
Doctor Nikola shows a fork that was
removed from an Israeli woman who
swallowed it while trying to catch a bug
that flew in to her mouth, in Poriah
Hospital northern Israel July 10, 2003.
Doctors performed emergency surgery
and removed the fork. (Reuters)
President George W. Bush waves as he
leaves the White House for a day trip to
North Carolina, July 25, 2002. A White
House spokesman said that Bush would
be compelled to veto Senate legislation
creating a new department of homeland
security unless changes are made. (Kevin
Lamarque/Reuters)
Dataset (cont.)
Actor Arnold Schwarzenegger
discusses his plans for his campaign
with reporters after he announced
that he will be a candidate for
governor of California in the recall
election, August 6, 2003.
Schwarzenegger ended weeks of
speculation about his candidacy for
the October 7 referendum. (Fred
Producer and director Bruce Paltrow has died
at the age of 58 in Rome, Italy, the U.S.
Consulate said on October 3, 2002. Paltrow had
suffered from throat cancer for several years,
but the cause of his death was not immediately
known. He is seen with his daughter actress
Gwyneth Paltrow after the Academy Awards
in Los Angles in March 21, 1999 file photo.
(Fred Prouser/Reuters)
Algorithm Outline
Proper Name Extraction
Face Detection
Face Rectification
Build Face Representation
Assign names to faces by clustering (CVPR)
Improve clustering by incorporating a language
model (NIPS)
Proper Name Extraction

Determine proper name strings
- Captions are very stylized
- Use morphological rules
- WordNet for parts of speech
President George W. Bush makes a
statement in the Rose Garden while
Secretary of Defense Donald Rumsfeld
looks on, July 23, 2003. Rumsfeld said
the United States would release graphic
photographs of the dead sons of
Saddam Hussein to prove they were
killed by American troops. Photo by
Larry Downing/Reuters
Detected Names:
President George,
Defense Donald Rumsfeld,
Donald Rumsfeld,
Saddam Hussein.
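As a rough illustration of the rule-based idea above, the sketch below collects runs of capitalized words from a caption. It is not the detector used in the talk: the regular expression, the `LEADING_NON_NAME_WORDS` list and the function name are assumptions made for this example, and no WordNet lookup is performed.

```python
import re

# Assumption: a tiny, hypothetical list of capitalized words that should not start a name.
LEADING_NON_NAME_WORDS = {"The", "A", "An", "In", "At", "On", "Photo"}

# One capitalized word followed by at least one more capitalized word or initial.
NAME_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+(?:[A-Z]\.|[A-Z][a-z]+))+")


def extract_name_candidates(caption: str) -> list[str]:
    """Return runs of two or more capitalized words as name candidates."""
    candidates = []
    for match in NAME_RUN.finditer(caption):
        words = match.group(0).split()
        # Drop leading words that are capitalized only because of their position.
        while words and words[0] in LEADING_NON_NAME_WORDS:
            words = words[1:]
        if len(words) >= 2:
            candidates.append(" ".join(words))
    return candidates


if __name__ == "__main__":
    caption = (
        "President George W. Bush makes a statement in the Rose Garden "
        "while Secretary of Defense Donald Rumsfeld looks on"
    )
    for name in extract_name_candidates(caption):
        print(name)
```

A heuristic like this over-generates (it also returns place names such as "Rose Garden") and keeps titles attached to names, which is why part-of-speech information and later filtering steps matter.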
Proper Name Extraction
- Extract names using an open source proper name detector:
  H. Cunningham, D. Maynard, K. Bontcheva, V. Tablan, “GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications”
- An example:
- Information Extraction
Face Detection

Face Detector
- Courtesy of Cordelia Schmid (Schneiderman & Kanade)

45,000 large faces
- Mainly frontal
- Scaled to 86x86 pixels
Face Detector


Schneiderman and Kanade
Representation by subsets of quantized wavelet coefficients
Rectify Faces



Feature detectors - trained to recognize nose,
corners of eyes and mouth (SVM is used for this)
Determine affine transformation between detected
and canonical positions
Project images into canonical pose
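The affine step can be sketched as a least-squares fit between detected feature positions and their canonical positions. This is only an illustration under assumptions: it uses NumPy, the function names are hypothetical, and the talk does not say how the transformation was actually estimated.

```python
import numpy as np


def fit_affine(detected: np.ndarray, canonical: np.ndarray) -> np.ndarray:
    """Least-squares 2x3 affine map taking detected points to canonical points.

    Both inputs have shape (n, 2) with n >= 3 non-collinear points.
    """
    n = detected.shape[0]
    homogeneous = np.hstack([detected, np.ones((n, 1))])  # shape (n, 3)
    # Solve homogeneous @ X ~= canonical for X of shape (3, 2).
    solution, *_ = np.linalg.lstsq(homogeneous, canonical, rcond=None)
    return solution.T  # shape (2, 3): [linear part | translation]


def apply_affine(affine: np.ndarray, points: np.ndarray) -> np.ndarray:
    """Apply a 2x3 affine map to points of shape (n, 2)."""
    return points @ affine[:, :2].T + affine[:, 2]
```

With the fitted map, every pixel coordinate of the detected face can be sent to the canonical frame; the residual between mapped and canonical feature positions gives one possible measure of how well the fit worked.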
Image Features
Geometric Blur Feature Similarity
(A.) Berg & Malik ‘01
[Figure: a sparse signal S and its geometric blur B around the point x0]
Apply a spatially varying blur, and sample values
Authors claim significant gain in rectification accuracy using this
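A one-dimensional sketch of a spatially varying blur is given below. It assumes the blur width grows linearly with distance from x0 (sigma = alpha * |offset| + beta); the parameter values, the function name and the use of NumPy are illustrative choices, not details taken from the talk.

```python
import numpy as np


def geometric_blur_1d(signal, x0, offsets, alpha=0.5, beta=1.0):
    """Sample a blurred copy of `signal` around x0.

    The sample at x0 + offset is a Gaussian-weighted average of the signal
    whose width grows with |offset| (assumed: sigma = alpha * |offset| + beta).
    """
    signal = np.asarray(signal, dtype=float)
    positions = np.arange(len(signal))
    samples = []
    for offset in offsets:
        sigma = alpha * abs(offset) + beta
        weights = np.exp(-0.5 * ((positions - (x0 + offset)) / sigma) ** 2)
        weights /= weights.sum()  # normalize so a constant signal is unchanged
        samples.append(float(weights @ signal))
    return np.array(samples)
```

Samples far from x0 are averaged over a wider neighbourhood, so small misalignments of distant structure change the descriptor less than misalignments near the centre.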
Rectification

Rectification score helps eliminate errors in face
detection
Face Representation




Faces initially represented as vectors of pixels
Huge dataset – Need good representation
Dimensionality Reduction
Kernel PCA: dimensionality reduction onto a nonlinear basis
- Compute the kernel function K(i,j) for each pair of images
- Calculate eigenvectors of the kernel matrix and project
Kernel PCA
Use kernel function to compute PCs in high-dimensional
feature space

Compute a dot product kernel matrix
- They use a Gaussian kernel for this work

Center the matrix by subtracting the row and column averages and adding back the average of all elements

Compute the eigendecomposition of the kernel matrix

Compute projections of a test point onto the normalized eigenvectors of the kernel matrix
Nonlinear Component Analysis as a Kernel Eigenvalue Problem, Bernhard Schölkopf, Alexander Smola, Klaus-Robert Müller
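The four steps on this slide can be sketched on a small dataset as follows. This is a minimal illustration using NumPy with the full kernel matrix (no approximation); the function names, the dictionary used as a model and the `sigma` bandwidth parameter are assumptions for the example.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel between rows of a (n, d) and rows of b (m, d)."""
    sq_dists = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def fit_kernel_pca(train, n_components, sigma):
    k = gaussian_kernel(train, train, sigma)
    n = k.shape[0]
    ones = np.full((n, n), 1.0 / n)
    # Center: subtract row and column averages, add back the overall average.
    k_centered = k - ones @ k - k @ ones + ones @ k @ ones
    eigvals, eigvecs = np.linalg.eigh(k_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Scale so the corresponding feature-space directions have unit norm.
    alphas = eigvecs[:, order] / np.sqrt(eigvals[order])
    return {"train": train, "k": k, "alphas": alphas, "sigma": sigma}


def project(model, points):
    """Project new points onto the normalized kernel principal components."""
    k = model["k"]
    k_new = gaussian_kernel(points, model["train"], model["sigma"])
    k_new_centered = (
        k_new - k_new.mean(axis=1, keepdims=True) - k.mean(axis=0)[None, :] + k.mean()
    )
    return k_new_centered @ model["alphas"]
```

The sketch assumes the leading eigenvalues are strictly positive, which holds for a Gaussian kernel on distinct points when `n_components` is smaller than the number of training points.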
Issues and Hacks

Dataset too large: the kernel matrix is N x N (N = number of images)

Approximation to calculate eigenvectors of K:
- Nyström approximation: compute subsets of the kernel matrix and use them to approximate the whole kernel matrix
- Only compute N*m entries in K
- Only compute eigenvectors of m x m matrices

Nyström Method
Solve problem for a random subset of pixels and extrapolate
to full set of pixels

Partition kernel matrix as: $K = \begin{bmatrix} A & B \\ B^T & C \end{bmatrix}$
- Approximate $C$ with $\hat{C} = B^T A^{-1} B$
- $A$ in this case was computed from 1000 randomly selected images
Spectral Grouping Using the Nyström Method, Charless Fowlkes, Serge Belongie, Fan Chung, Jitendra Malik
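A small numerical sketch of the Nyström idea: with A the kernel block among the sampled images and B the block between sampled and remaining images, the standard Nyström extension fills in the unsampled block C as B^T A^{-1} B. The function name and the use of a pseudo-inverse are choices made for this example, not details from the talk.

```python
import numpy as np


def nystrom_complete(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Approximate the full kernel matrix [[A, B], [B^T, C]] from A and B only.

    a: (m, m) kernel among the m sampled items.
    b: (m, N - m) kernel between sampled and remaining items.
    """
    # The pseudo-inverse guards against a singular or badly conditioned A.
    c_approx = b.T @ np.linalg.pinv(a) @ b
    return np.block([[a, b], [b.T, c_approx]])
```

The approximation is exact when the full kernel matrix has the same rank as A; otherwise its quality depends on how well the sampled items span the rest of the data.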
Face Representation (cont.)



We are dealing with face classes
We need to work in the space that is useful for telling
classes apart
Efficient clustering in discriminant coordinates
Linear Discriminant Analysis
- Discriminants take advantage of class information
- We don't know the classes
- Build discriminants using one-name faces, then project all face vectors into this space
Face Representation (cont.)
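Linear Discriminant Analysis
- Consider: classes $X_i$ (with $n_i$ members each) and class means $m_i$, and the overall mean $m$
- Within-Class Scatter Matrix (average scatter of classes around their means): $S_W = \sum_i \sum_{x \in X_i} (x - m_i)(x - m_i)^T$
- Between-Class Scatter Matrix (scatter of conditional means around the overall mean): $S_B = \sum_i n_i (m_i - m)(m_i - m)^T$
- LDA computes the projection that maximizes the ratio: $W_{opt} = \arg\max_W \frac{|W^T S_B W|}{|W^T S_W W|}$
- By solving the generalised eigenvalue problem: $S_B w = \lambda S_W w$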
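A compact sketch of LDA using the textbook scatter matrices and the generalised eigenvalue problem. It assumes NumPy, labelled feature vectors, and a small ridge term added to the within-class scatter for numerical stability; the ridge value and the function name are assumptions for this example.

```python
import numpy as np


def fit_lda(features: np.ndarray, labels: np.ndarray, n_components: int) -> np.ndarray:
    """Return a (dim, n_components) projection matrix of discriminant directions."""
    overall_mean = features.mean(axis=0)
    dim = features.shape[1]
    s_w = np.zeros((dim, dim))  # within-class scatter
    s_b = np.zeros((dim, dim))  # between-class scatter
    for label in np.unique(labels):
        members = features[labels == label]
        class_mean = members.mean(axis=0)
        centered = members - class_mean
        s_w += centered.T @ centered
        diff = (class_mean - overall_mean)[:, None]
        s_b += len(members) * (diff @ diff.T)
    s_w += 1e-6 * np.eye(dim)  # assumed ridge so that S_W is invertible
    # Solve S_B w = lambda S_W w via the eigenvectors of S_W^{-1} S_B.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(s_w, s_b))
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real
```

Projecting a face vector is then `features @ projection`. At most (number of classes - 1) directions carry discriminative information.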
Name-Face Assignment



Name Assignment = hidden variable problem
Expectation Maximization over name-face
correspondences (hidden variables)
Mantra:
- E: Given an initial name-face clustering, estimate name-face correspondences
- M: Update clusters using estimated correspondences
Name-Face Assignment
EM

Likelihood of a picture under a given assignment of names to faces:
P(face|name)

Complete data log likelihood:
[Equation: a sum over images and over their candidate correspondences, with a hidden variable indicating the correspondence]
EM vs MM

Maximal Assignment
- Instead of the expected value, only the maximum likelihood assignment receives non-zero probability (a probability of 1)

EM (soft) vs MM (hard)
Claim: MM is better
Reason: EM assigns weight to incorrect assignments, so the expected value is influenced by wrong assignments
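The difference between the soft (EM) and hard (MM) updates can be shown for a single picture. In the sketch below the likelihood values of the candidate name-face assignments are made-up inputs and the function names are hypothetical; only the weighting scheme is illustrated.

```python
def soft_weights(likelihoods):
    """EM-style weights: each candidate assignment is weighted by its normalized likelihood."""
    total = sum(likelihoods)
    return [value / total for value in likelihoods]


def hard_weights(likelihoods):
    """MM-style weights: the most likely assignment gets weight 1, all others 0."""
    best = max(range(len(likelihoods)), key=lambda index: likelihoods[index])
    return [1.0 if index == best else 0.0 for index in range(len(likelihoods))]


if __name__ == "__main__":
    candidate_likelihoods = [0.2, 0.6, 0.2]  # hypothetical values for one picture
    print(soft_weights(candidate_likelihoods))
    print(hard_weights(candidate_likelihoods))
```

With soft weights the two less likely assignments still contribute to the cluster update, which is the effect the slide gives as the reason for preferring the hard assignment.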


Improved Clustering:
Context Understanding

Context provides cues as to whether name is pictured or not

Incorporate a language model

Additions to EM:
- Find expected name-face correspondences (given a face clustering and language model)
- Update face clusters and language model given correspondences
Language Model
P(Pictured | Context): a yes/no decision from multiple independent cues
Cues: POS tags before and after name, location in caption, distance to closest {. , ( ) (L) (C) (R)}.
Cue Combination:
Language Model

Naïve Bayes model vs Maximum Entropy model

Maximum Entropy Model:
- Specify set of constraints based on statistics
- Choose model consistent with statistics
- As uniform as possible

Similar performance (ME slightly better)

Also: Word and Face context : Not much improvement
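To make the naive Bayes option concrete, the sketch below combines independent cues into P(pictured | cues) with Bayes' rule. The prior and the per-cue probabilities are invented numbers and the function name is hypothetical; the talk does not give the actual cue statistics.

```python
import math


def naive_bayes_pictured(prior, cue_likelihoods):
    """Posterior probability that a name is pictured, assuming independent cues.

    prior: P(pictured)
    cue_likelihoods: list of (P(cue | pictured), P(cue | not pictured)) pairs.
    """
    log_yes = math.log(prior)
    log_no = math.log(1.0 - prior)
    for p_given_yes, p_given_no in cue_likelihoods:
        log_yes += math.log(p_given_yes)
        log_no += math.log(p_given_no)
    # Normalize in log space to avoid underflow with many cues.
    largest = max(log_yes, log_no)
    yes = math.exp(log_yes - largest)
    no = math.exp(log_no - largest)
    return yes / (yes + no)


if __name__ == "__main__":
    # Hypothetical cues: name appears early in the caption, name is followed by "(L)".
    print(naive_bayes_pictured(0.5, [(0.8, 0.2), (0.7, 0.1)]))
```

A maximum entropy model would replace the independence assumption with feature weights fitted to match the observed statistics.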
Results
Results
British director Sam Mendes
and his partner actress Kate
Winslet arrive at the London
premiere of 'The Road to
Perdition', September 18, 2002.
The films stars Tom Hanks as a
Chicago hit man who has a
separate family life and co-stars
Paul Newman and Jude Law.
World number one Lleyton
Hewitt of Australia hits a return to
Nicolas Massu of Chile at the
Japan Open tennis
championships in Tokyo October
3, 2002. REUTERS/Eriko Sugita
Results
US President George W. Bush (L)
makes remarks while Secretary of State
Colin Powell (R) listens before signing
the US Leadership Against HIV /AIDS ,
Tuberculosis and Malaria Act of 2003 at
the Department of State in Washington,
DC. The five-year plan is designed to
help prevent and treat AIDS, especially
in more than a dozen African and
Caribbean nations(AFP/Luke Frazza)
German supermodel Claudia Schiffer
gave birth to a baby boy by Caesarian
section January 30, 2003, her
spokeswoman said. The baby is the first
child for both Schiffer, 32, and her
husband, British film producer Matthew
Vaughn, who was at her side for the
birth. Schiffer is seen on the German
television show 'Bet It...?!' ('Wetten
Dass...?!') in Braunschweig, on January
The Results of Adding Language
Before: Image based clustering
After: Image + language based clustering
- before: CEO Summit, after: Martha Stewart
- before: US House, after: Andrew Fastow
- before: Julia Vakulenko, after: Jennifer Capriati
- before: James Bond, after: Pierce Brosnan
- before: Vice President Dick Cheney, after: President George W.
- before: al Qaeda, after: Null
- before: Marcel Avram, after: Michael Jackson
- before: Ric Pipino, after: Heidi Klum
Without Lang Model
With Lang Model
Caption Labeling
IN Pete Sampras IN of the U.S. celebrates his victory over Denmark's OUT Kristian Pless OUT at the
OUT U.S. Open OUT at Flushing Meadows August 30, 2002. Sampras won the match 6-3 7- 5 6-4.
REUTERS/Kevin Lamarque
Germany's IN Chancellor Gerhard Schroeder IN, left, in discussion with France's IN President Jacques
Chirac IN on the second day of the EU summit at the European Council headquarters in Brussels, Friday
Oct. 25, 2002. EU leaders are to close a deal Friday on finalizing entry talks with 10 candidate countries
after a surprise breakthrough agreement on Thursday between France and Germany regarding farm
spending.(AP Photo/European Commission/HO)
'The Right Stuff' cast members IN Pamela Reed IN, (L) poses with fellow cast member IN Veronica
Cartwright IN at the 20th anniversary of the film in Hollywood, June 9, 2003. The women played wives of
astronauts in the film about early United States test pilots and the space program. The film directed by OUT
Philip Kaufman OUT, is celebrating its 20th anniversary and is being released on DVD. REUTERS/Fred
Prouser
Results
Face Labeling:
Caption Labeling:
Web Interface
Face Recognition









Dataset of 3076 faces (241 individuals)
Clusters were hand cleaned to remove erroneous labels
Half for testing and half for training
PCA (100) + NN = 9.4-15.4%
PCA + LDA (50) = 17-27.4%
Usual PCA+LDA claim : 80-90%!!
Conclusion: Challenging Dataset
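For reference, a PCA + nearest neighbour baseline of the kind listed above can be sketched in a few lines. This is a generic illustration with NumPy, not the evaluation code behind the reported numbers; the function names are assumptions.

```python
import numpy as np


def fit_pca(train: np.ndarray, n_components: int):
    """Return the training mean and the top principal directions (rows)."""
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_components]


def nearest_neighbor_labels(train_proj, train_labels, test_proj):
    """Label each test vector with the label of its closest training vector."""
    dists = ((test_proj[:, None, :] - train_proj[None, :, :]) ** 2).sum(axis=2)
    return train_labels[dists.argmin(axis=1)]
```

Faces are projected with `(faces - mean) @ basis.T` before the nearest neighbour search; accuracy is the fraction of test faces whose predicted label matches the true one.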
Online Dataset
Online Results
What’s the story?





Problem Identification : Organising data
Dataset collection : News dataset
Usage of simple methods
Adding Language Information
Contributions
- Good Face Recognition Dataset
- Organising data on the web (News reels in this case)
Acknowledgements



Some of the slides for this presentation were obtained from the
authors
Thanks to Ms. Tamara Berg, UC Berkeley, for providing slides
and links to online resources
Thanks for listening!