Names and Faces
Presented by Krishnan Ramnath
Robotics Institute, CMU, Pittsburgh

Agenda
- Motivation and Goal
- Dataset
- Algorithm Outline
- Results

Goal
President George W. Bush makes a statement in the Rose Garden while Secretary of Defense Donald Rumsfeld looks on, July 23, 2003. Rumsfeld said the United States would release graphic photographs of the dead sons of Saddam Hussein to prove they were killed by American troops. Photo by Larry Downing/Reuters

Motivation
- The web has made large multimedia collections readily available
  - Organise contents
- Commercial face recognition systems do not work in practice
  - Datasets were captured in labs and lack the complexity of real-world faces

Airport Face Scanner Failed (Wired News)
Facial recognition technology tested at the Palm Beach International Airport had a dismal failure rate, according to preliminary results from a pilot program at the facility. The Palm Beach airport tried Visionics' FaceIt system, which snaps photographs of passersby using a security camera and breaks down their facial features into a numeric code that is matched against the photograph database. The system failed to correctly identify airport employees 53 percent of the time, according to test data that was obtained by the American Civil Liberties Union under Florida's open records law. "The preliminary results at the Palm Beach International Airport confirm that the use of facial recognition technology is simply ineffective and of no value," said Randall Marshall, legal director of the state ACLU chapter.
Methodology
1) Collect a new dataset from news photographs
  - Individual faces are not labeled
2) Develop a technique for automatically associating names with faces
Applications:
- Organise news pictures by the individuals present
- Browse by individual

Dataset
- 500,000 news pictures with associated captions
- Pictures captured “in the wild”
- Mainly people: politicians, actors, sports players
- Multiple names for one face
- Heavy-tailed distribution
- Multiple faces for one name

Dataset
Actress Winona Ryder (news) reacts to remarks by prosecutor Ann Rundle during the sentencing hearing in her felony shoplifting case Friday, Dec. 6, 2002 at the Beverly Hills, Calif., courthouse. At right is Ryder's attorney Mark Geragos. Ryder was sentenced to three years of probation and was ordered to perform 480 hours of community service. (AP Photo/Steve Grayson, POOL)

France's Amelie Mauresmo has pulled out of the Australian Open (news web sites) because of a knee injury. The French number one, ranked sixth in the world, has been struggling with a painful knee cap for two months and said on December 18, 2002 she would not be fit in time to resume playing in January. Mauresmo is shown during the Kremlin Cup, Oct. 4. (Grigory Dukor/Reuters)

Dataset (cont.)
Doctor Nikola shows a fork that was removed from an Israeli woman who swallowed it while trying to catch a bug that flew in to her mouth, in Poriah Hospital northern Israel July 10, 2003. Doctors performed emergency surgery and removed the fork. (Reuters)

President George W. Bush waves as he leaves the White House for a day trip to North Carolina, July 25, 2002. A White House spokesman said that Bush would be compelled to veto Senate legislation creating a new department of homeland security unless changes are made. (Kevin Lamarque/Reuters)

Dataset (cont.)
Actor Arnold Schwarzenegger discusses his plans for his campaign with reporters after he announced that he will be a candidate for governor of California in the recall election, August 6, 2003. Schwarzenegger ended weeks of speculation about his candidacy for the October 7 referendum. (Fred

Producer and director Bruce Paltrow has died at the age of 58 in Rome, Italy, the U.S. Consulate said on October 3, 2002. Paltrow had suffered from throat cancer for several years, but the cause of his death was not immediately known. He is seen with his daughter actress Gwyneth Paltrow after the Academy Awards in Los Angeles in a March 21, 1999 file photo. (Fred Prouser/Reuters)

Algorithm Outline
- Proper Name Extraction
- Face Detection
- Face Rectification
- Build Face Representation
- Assign names to faces by clustering (CVPR)
- Improve clustering by incorporating a language model (NIPS)

Proper Name Extraction
- Determine proper name strings
- Captions are very stylized
- Use morphological rules
- WordNet for parts of speech

President George W. Bush makes a statement in the Rose Garden while Secretary of Defense Donald Rumsfeld looks on, July 23, 2003. Rumsfeld said the United States would release graphic photographs of the dead sons of Saddam Hussein to prove they were killed by American troops. Photo by Larry Downing/Reuters

Detected Names: President George, Defense Donald Rumsfeld, Donald Rumsfeld, Saddam Hussein.
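To make the idea of pulling proper name strings out of a stylized caption concrete, here is a minimal sketch. It is not the authors' morphological rules, WordNet lookup, or GATE: it only assumes that a name candidate is a run of two or more capitalized tokens (allowing middle initials such as "W."). The function name `extract_name_candidates` and the regular expression are hypothetical choices made for illustration.

```python
import re

# Assumption for illustration only: a "name candidate" is a run of at least
# two capitalized tokens, where later tokens may be single initials ("W.").
# This is a crude stand-in for the rule-based detector described on the slide.
NAME_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+(?:[A-Z]\.|[A-Z][a-z]+))+")


def extract_name_candidates(caption):
    """Return capitalized word runs found in a caption, in order of appearance."""
    return [match.group(0) for match in NAME_RUN.finditer(caption)]


caption = (
    "President George W. Bush makes a statement in the Rose Garden while "
    "Secretary of Defense Donald Rumsfeld looks on, July 23, 2003. Rumsfeld "
    "said the United States would release graphic photographs of the dead "
    "sons of Saddam Hussein to prove they were killed by American troops."
)

for candidate in extract_name_candidates(caption):
    print(candidate)
```

A heuristic like this also picks up non-person strings such as place names and leaves title fragments attached to names, much like the imperfect "Detected Names" list on the slide. That is one reason a later stage has to decide which detected names are actually pictured.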
Proper Name Extraction
- Extract names using an open-source proper name detector: H. Cunningham, D. Maynard, K. Bontcheva, V. Tablan, “GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications”
- An Example: Information Extraction

Face Detection
President George W. Bush makes a statement in the Rose Garden while Secretary of Defense Donald Rumsfeld looks on, July 23, 2003. Rumsfeld said the United States would release graphic photographs of the dead sons of Saddam Hussein to prove they were killed by American troops. Photo by Larry Downing/Reuters
- Face detector courtesy of Cordelia Schmid (Schneiderman & Kanade)
- 45,000 large faces, mainly frontal
- Scaled to 86x86 pixels

Face Detector
- Schneiderman and Kanade
- Representation by subsets of quantized wavelet coefficients

Rectify Faces
- Feature detectors trained to recognize the nose and the corners of the eyes and mouth (an SVM is used for this)
- Determine the affine transformation between detected and canonical positions
- Project images into canonical pose

Image Features
- Geometric Blur Feature Similarity (A. Berg & Malik ’01)
- [Figure panels: Sparse Signal, S; Geometric Blur, B; each marked at point x0]
- Apply a spatially varying blur, and sample values
- The authors claim a significant gain in rectification accuracy using this

Rectification
- Rectification score helps eliminate errors in face detection

Face Representation
- Faces are initially represented as vectors of pixels
- Huge dataset: need a good representation

Dimensionality Reduction
- Kernel PCA: dimension reduction onto a nonlinear basis
- Compute the kernel function K(i,j) for each pair of images
- Calculate the eigenvectors of the kernel matrix and project

Kernel PCA
- Use a kernel function to compute principal components in a high-dimensional feature space
- Compute a dot product kernel matrix
  - They use a Gaussian kernel for this work
- Center the matrix by subtracting the average row and the average column and adding the average element value
- Compute the eigendecomposition of the kernel matrix
- Compute projections of a test point onto the normalized eigenvectors of the kernel matrix
- Reference: Non-linear Component Analysis as a Kernel Eigenvalue Problem, Bernhard Schölkopf, Alexander Smola, Klaus-Robert Müller

Issues and Hacks
- Dataset too large: the kernel matrix is NxN (N = number of images)
- Approximation to calculate eigenvectors of K: Nyström approximation
  - Compute subsets of the kernel matrix and use them to approximate the whole kernel matrix
  - Only compute N*m entries in K
  - Only compute eigenvectors of mxm matrices

Nyström Method
- Solve the problem for a random subset of pixels and extrapolate to the full set of pixels
- Partition the kernel matrix as: $K = \begin{bmatrix} A & B \\ B^T & C \end{bmatrix}$
- Approximate C with $B^T A^{-1} B$
- A in this case was computed from 1000 randomly selected images
- Reference: Spectral Grouping Using the Nyström Method, Charless Fowlkes, Serge Belongie, Fan Chung, Jitendra Malik

Face Representation (cont.)
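The kernel PCA and Nyström slides above can be tied together in a short NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation. It fits kernel PCA on a random subset of m points (the block A), then extends the projection to every point using only the N*m kernel entries between all points and that subset. The function names, the `sigma` bandwidth, and the toy data are hypothetical, and the extra orthogonalization steps of Fowlkes et al. are not shown.

```python
import numpy as np


def gaussian_kernel(X, Y, sigma):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq_dist = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))


def nystrom_kernel_pca(X, m, n_components, sigma, seed=0):
    """Kernel PCA fitted on m random rows of X, extended to all rows.

    Only an m x m matrix is eigendecomposed and only n x m kernel
    entries are computed, which is the point of the approximation.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    subset = X[rng.choice(n, size=m, replace=False)]

    A = gaussian_kernel(subset, subset, sigma)   # m x m block
    K_nm = gaussian_kernel(X, subset, sigma)     # n x m entries

    # Centering: subtract average row and column, add back the overall average.
    col_mean = A.mean(axis=0)
    total_mean = A.mean()
    A_c = A - col_mean[None, :] - col_mean[:, None] + total_mean
    K_c = K_nm - K_nm.mean(axis=1, keepdims=True) - col_mean[None, :] + total_mean

    # Eigendecomposition of the small centered matrix, largest eigenvalues first.
    evals, evecs = np.linalg.eigh(A_c)
    order = np.argsort(evals)[::-1][:n_components]
    evals = np.maximum(evals[order], 1e-12)      # guard against round-off
    evecs = evecs[:, order]

    # Normalize eigenvectors, then project every point onto them.
    alphas = evecs / np.sqrt(evals)
    return K_c @ alphas


# Toy usage: 200 random "images" of dimension 10, reduced to 5 coordinates.
X = np.random.default_rng(1).normal(size=(200, 10))
Z = nystrom_kernel_pca(X, m=50, n_components=5, sigma=3.0)
print(Z.shape)
```

The subset size m trades accuracy for cost. The slides report using 1000 randomly selected images for A. The toy call uses 50 of 200 points only so that it runs instantly.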
- We are dealing with face classes
- We need to work in the space that is useful for telling classes apart
- Efficient clustering in discriminant coordinates

Linear Discriminant Analysis
- Discriminants take advantage of class information
- We don't know the classes
- Build discriminants using faces that have only one associated name, then project all face vectors into this space

Face Representation (cont.)
Linear Discriminant Analysis
- Consider the within-class scatter matrix (average scatter of the classes around their means):
  $S_W = \sum_{i=1}^{c} \sum_{x_k \in X_i} (x_k - m_i)(x_k - m_i)^T$
- and the between-class scatter matrix (scatter of the conditional means around the overall mean):
  $S_B = \sum_{i=1}^{c} n_i (m_i - m)(m_i - m)^T$
  where $X_i$ is the set of $n_i$ samples in class $i$, $m_i$ is the mean of class $i$, $m$ is the overall mean, and $c$ is the number of classes
- LDA computes the projection that maximizes the ratio:
  $W_{opt} = \arg\max_W \frac{|W^T S_B W|}{|W^T S_W W|}$
- by solving the generalised eigenvalue problem:
  $S_B w = \lambda S_W w$

Algorithm Outline
- Proper Name Extraction
- Face Detection
- Face Rectification
- Build Face Representation
- Assign names to faces by clustering (EM)
- Improve clustering by incorporating a language model

Name-Face Assignment
- Name assignment = hidden variable problem
- Expectation Maximization over name-face correspondences (hidden variables)
- Mantra:
  - E: Given an initial name-face clustering, estimate name-face correspondences
  - M: Update clusters using the estimated correspondences

Name-Face Assignment EM
- Likelihood of a picture under a given assignment of names to faces: P(face|name)
- Complete data log likelihood: [equation; annotated terms: hidden variable, images, correspondences]

EM vs MM
- Maximal Assignment (MM): instead of taking the expected value, only the maximum-likelihood assignment receives non-zero probability (probability 1)
- EM (soft) vs MM (hard)
- Claim: MM is better
- Reason: EM assigns weight to incorrect assignments, so the expected value is influenced by wrong assignments

Improved Clustering: Context Understanding
- Context provides cues as to whether a name is pictured or not
- Incorporate a language model
- Additions to EM:
  - Find expected name-face correspondences (given a face clustering and language model)
  - Update face clusters and language model given correspondences

Language Model
- P(Pictured | Context): Yes/No
- Multiple independent cues
- Cues: POS tags before and after the name, location in the caption, distance to the closest of { . , ( ) (L) (C) (R) }

Cue Combination: Language Model
- Naïve Bayes model vs Maximum Entropy model
- Maximum Entropy model:
  - Specify a set of constraints based on statistics
  - Choose the model consistent with the statistics
  - As uniform as possible
- Similar performance (ME slightly better)
- Also: word and face context gave not much improvement

Results
British director Sam Mendes and his partner actress Kate Winslet arrive at the London premiere of 'The Road to Perdition', September 18, 2002. The film stars Tom Hanks as a Chicago hit man who has a separate family life and co-stars Paul Newman and Jude Law.

World number one Lleyton Hewitt of Australia hits a return to Nicolas Massu of Chile at the Japan Open tennis championships in Tokyo October 3, 2002. REUTERS/Eriko Sugita

Results
US President George W. Bush (L) makes remarks while Secretary of State Colin Powell (R) listens before signing the US Leadership Against HIV/AIDS, Tuberculosis and Malaria Act of 2003 at the Department of State in Washington, DC. The five-year plan is designed to help prevent and treat AIDS, especially in more than a dozen African and Caribbean nations (AFP/Luke Frazza)

German supermodel Claudia Schiffer gave birth to a baby boy by Caesarian section January 30, 2003, her spokeswoman said. The baby is the first child for both Schiffer, 32, and her husband, British film producer Matthew Vaughn, who was at her side for the birth. Schiffer is seen on the German television show 'Bet It...?!'
('Wetten Dass...?!') in Braunschweig, on January

The Results of Adding Language
- Before: image-based clustering
- After: image + language-based clustering
- before: CEO Summit; after: Martha Stewart
- before: US House; after: Andrew Fastow
- before: Julia Vakulenko; after: Jennifer Capriati
- before: James Bond; after: Pierce Brosnan
- before: Vice President Dick Cheney; after: President George W.
- before: al Qaeda; after: Null
- before: Marcel Avram; after: Michael Jackson
- before: Ric Pipino; after: Heidi Klum
- [Figure panels: Without Lang Model; With Lang Model]

Caption Labeling
IN Pete Sampras IN of the U.S. celebrates his victory over Denmark's OUT Kristian Pless OUT at the OUT U.S. Open OUT at Flushing Meadows August 30, 2002. Sampras won the match 6-3 7-5 6-4. REUTERS/Kevin Lamarque

Germany's IN Chancellor Gerhard Schroeder IN, left, in discussion with France's IN President Jacques Chirac IN on the second day of the EU summit at the European Council headquarters in Brussels, Friday Oct. 25, 2002. EU leaders are to close a deal Friday on finalizing entry talks with 10 candidate countries after a surprise breakthrough agreement on Thursday between France and Germany regarding farm spending. (AP Photo/European Commission/HO)

'The Right Stuff' cast members IN Pamela Reed IN, (L) poses with fellow cast member IN Veronica Cartwright IN at the 20th anniversary of the film in Hollywood, June 9, 2003. The women played wives of astronauts in the film about early United States test pilots and the space program. The film directed by OUT Philip Kaufman OUT, is celebrating its 20th anniversary and is being released on DVD. REUTERS/Fred Prouser

Results
- Face Labeling:
- Caption Labeling:

Web Interface

Face Recognition
- Dataset of 3076 faces (241 individuals)
- Clusters were hand-cleaned to remove erroneous labels
- Half for testing and half for training
- PCA (100) + NN = 9.4-15.4%
- PCA + LDA (50) = 17-27.4%
- Usual PCA+LDA claim: 80-90%!!
- Conclusion: challenging dataset

Online Dataset

Online Results

What’s the story?
- Problem identification: organising data
- Dataset collection: news dataset
- Usage of simple methods
- Adding language information

Contributions
- Good face recognition dataset
- Organising data on the web (news reels in this case)

Acknowledgements
- Some of the slides for this presentation were obtained from the authors
- Thanks to Ms. Tamara Berg, UC Berkeley, for providing slides and links to online resources

Thanks for listening!