Scene Completion Using Millions of Photographs
James Hays, Alexei A. Efros
Carnegie Mellon University
ACM SIGGRAPH 2007

Outline
- Introduction
- Overview
- Semantic Scene Matching
- Local Context Matching
- Results and Comparison
- Conclusion

Introduction
- Image completion (inpainting, hole-filling): filling in or replacing an image region with new image data such that the modification cannot be detected
- The data could have been there
- The data should have been there
- Existing methods operate by extending adjacent textures and contours into the unknown region, filling it with content from the known parts of the input image
- Their assumption is that all the image data needed to fill an unknown region is located somewhere else in the same image
- This assumption is flawed

Overview
- We perform image completion by leveraging a massive database of images
- Two compelling reasons:
  - A region may be impossible to fill plausibly using only image data from the source image
  - Reusing content from the source image would often leave obvious duplications
- Drawing content from other images poses several challenges:
  - Computational cost
  - Semantically invalid fills
  - Combining image regions seamlessly
- To alleviate the computational and semantic challenges:
  - Find images depicting semantically similar scenes
  - Use only the best matching scenes to find patches that match the content surrounding the missing region
- To combine image regions seamlessly:
  - Graph cut segmentation
  - Poisson blending

Semantic Scene Matching
- Our image database
  - Downloaded images from thirty Flickr.com groups
  - Downloaded images based on keyword searches
  - Discarded duplicate images and images that are too small
  - Distributed among a cluster of 15 machines
  - Acquired about 2.3 million unique images
- Look for scenes that are most likely to be semantically equivalent to the image requiring completion
  - GIST descriptor
  - Augment the scene descriptor with color information from the query image, down-sampled to the spatial resolution of the gist
- Given an input image to be hole-filled, we first compute its gist descriptor with the missing regions excluded
- We calculate the SSD between the gist of the query image and every gist in the database
- The color difference is computed in the L*a*b* color space

Local Context Matching
- Having constrained our search to semantically similar scenes, we can use template matching to align the matching scenes more precisely with the query image
- Pixel-wise alignment score
  - We define the local context to be all pixels within an 80 pixel radius of the hole's boundary
  - This context is compared against the 200 best matching scenes using SSD error in L*a*b* color space
- Texture similarity score
  - Measures coarse compatibility of the proposed fill-in region with the source image within the local context
  - The texture descriptor is computed as a 5x5 median filter of image gradient magnitude at each pixel
  - The descriptors of the two images are compared via SSD
- Composite each matching scene into the incomplete image at its best placement using a form of graph cut seam finding and standard Poisson blending
- Past image completion algorithms: the remaining valid pixels in an image cannot be changed
- Our completion algorithm: allowed to remove valid pixels from the query image, but discouraged from cutting too many pixels
- Past seam finding: minimizes the intensity difference between the two images, which can cause the seam to pass through many high-frequency edges
- Our seam finding: minimizes the gradient of the image
  difference along the seam
- We find the seam by minimizing the following cost function:
    C(L) = sum_p Cd(p, L(p)) + sum_{p,q} Ci(p, q, L(p), L(q))
  - Cd(p, L(p)): unary cost of assigning pixel p to a specific label L(p)
  - L(p): patch or exist
- Unary costs
  - For missing regions of the existing image, Cd(p, exist) is a very large number
  - For regions of the image not covered by the scene match, Cd(p, patch) is a very large number
  - For all other pixels, Cd(p, patch) is based on the pixel's distance from the hole, with constant k = 0.02
- Pairwise costs
  - Ci(p, q, L(p), L(q)) is non-zero only for immediately adjacent, 4-way connected pixels
  - If L(p) = L(q), the cost is zero
  - If L(p) ≠ L(q), Ci(p, q, L(p), L(q)) = diff(p, q)
  - diff(p, q) is the magnitude of the gradient of the SSD between the existing image and the scene match at pixels p and q
- Finally, we assign each composite a score combining:
  - The scene matching distance
  - The local context matching distance
  - The local texture similarity distance
  - The cost of the graph cut
- We present the user with the 20 composites with the lowest scores

Results and Comparison
- Lucky cases: the search finds another image from the same physical location
- It is not our goal to complete scenes and objects with their true selves from the database
- Failure cases
  - Artifacts
  - Semantic violations
  - No object recognition
  - Cases where past methods perform well: for uniformly textured backgrounds, our method is unlikely to find the exact same texture in another photograph

Conclusion
- This paper presents a new image completion
  algorithm powered by a huge database, unlike past methods that reuse visual data within the source image
- Further work
  - Two million images are still a tiny fraction of the high quality photographs available
  - Our approach would make an attractive web-based application

Thank you!!!
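
The scene matching step from the Semantic Scene Matching slides (SSD between the query descriptor and every database descriptor, keeping the 200 best scenes) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: descriptors are assumed to be precomputed flat lists of numbers, the names `ssd`, `rank_scenes`, and `color_weight` are hypothetical, and the slides do not say how the gist and color terms are weighted, so `color_weight` is only a placeholder. Excluding the missing region from the descriptors is left out for brevity.

```python
from typing import List, Sequence, Tuple


def ssd(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared differences between two equal-length descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_scenes(
    query_gist: Sequence[float],
    query_color: Sequence[float],
    db_gists: Sequence[Sequence[float]],
    db_colors: Sequence[Sequence[float]],
    color_weight: float = 1.0,  # hypothetical: the weighting is not given in the slides
    top_k: int = 200,           # the slides keep the 200 best matching scenes
) -> List[Tuple[int, float]]:
    """Return (database index, distance) pairs for the top_k closest scenes."""
    scored = []
    for index, (gist, color) in enumerate(zip(db_gists, db_colors)):
        # Gist SSD plus weighted SSD of the down-sampled color descriptor
        # (color values are assumed to be in L*a*b* already).
        distance = ssd(query_gist, gist) + color_weight * ssd(query_color, color)
        scored.append((index, distance))
    # Python's sort is stable, so ties keep database order.
    scored.sort(key=lambda item: item[1])
    return scored[:top_k]


# Toy usage: three database scenes with 2-D "gist" and 1-D "color" descriptors.
toy_gists = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
toy_colors = [[0.0], [1.0], [0.25]]
best = rank_scenes([0.5, 0.5], [0.25], toy_gists, toy_colors, top_k=2)
```

Only the scenes returned here would go on to the more expensive local context matching, which is how restricting the search to the best matches eases the computational burden.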