Scene Completion Using Millions of Photographs

James Hays, Alexei A. Efros
Carnegie Mellon University
ACM SIGGRAPH 2007
Outline






Introduction
Overview
Semantic Scene Matching
Local Context Matching
Results and Comparison
Conclusion
Introduction

Image completion (inpainting, hole-filling)

Filling in or replacing an image region with new
image data such that the modification cannot
be detected
Introduction

The data could have been there

The data should have been there
Introduction

The existing methods operate by extending
adjacent textures and contours into the
unknown region

Filling in the unknown region with content from
the known parts of the input image
Introduction

The assumption is that all the necessary
image data to fill in an unknown region is
located somewhere else in the same image

This assumption is flawed
Overview


We perform image completion by leveraging
a massive database of images
Two compelling reasons


Some regions are impossible to fill plausibly
using only image data from the source image
Even when suitable content exists in the source image,
reusing it would often leave obvious
duplications
Overview

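There are several challenges with drawing
content from other images



Computational: searching a massive database is expensive
Semantic: a larger search space makes locally matching but
semantically invalid content more likely
Seamless: regions from different images must be combined
without visible seams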
Overview

Alleviating the computational and semantic challenges



Find images depicting semantically similar
scenes
Use only the best matching scenes to find
patches which match the content surrounding
the missing region
Seamlessly combine image regions


Graph cut segmentation
Poisson blending
Semantic Scene Matching

Our image database





Downloaded images from thirty Flickr.com groups
Downloaded images based on keyword searches
Discarded duplicate images and images that are
too small
Distributed among a cluster of 15 machines
Acquired about 2.3 million unique images
Semantic Scene Matching

Look for scenes which are most likely to be
semantically equivalent to the image
requiring completion


GIST descriptor
Augment the scene descriptor with color
information of the query image down-sampled
to the spatial resolution of the gist
Semantic Scene Matching



Given an input image to be hole-filled, we
first compute its gist descriptor with the
missing regions excluded
We calculate the SSD between the gist of
the query image and every gist in the
database
The color difference is computed in the L*a*b*
color space
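
The matching step above can be sketched as a weighted nearest-neighbor search. This is a minimal sketch, not the authors' implementation: descriptors are plain lists of floats, the hole is handled by zero weights on the affected descriptor entries, and `color_weight` is a hypothetical tuning parameter that the slides do not specify. The default `top_k=200` mirrors the 200 best matching scenes used later in the talk.

```python
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]


def ssd(a: Descriptor, b: Descriptor, w: Descriptor) -> float:
    """Weighted sum of squared differences; a weight of 0 drops an entry."""
    return sum(wi * (x - y) ** 2 for x, y, wi in zip(a, b, w))


def rank_scenes(
    query_gist: Descriptor,
    query_color: Descriptor,
    valid_gist: Descriptor,
    valid_color: Descriptor,
    database: List[Tuple[Descriptor, Descriptor]],
    color_weight: float = 0.5,  # hypothetical value, not from the slides
    top_k: int = 200,
) -> List[int]:
    """Return indices of the top_k database scenes, best match first.

    database holds (gist, color) descriptor pairs. valid_* are 1.0 where
    the query descriptor is usable and 0.0 where it overlaps the hole.
    """
    scored = []
    for idx, (gist, color) in enumerate(database):
        d = ssd(query_gist, gist, valid_gist)
        d += color_weight * ssd(query_color, color, valid_color)
        scored.append((d, idx))
    scored.sort()
    return [idx for _, idx in scored[:top_k]]
```

Setting a weight to zero makes the corresponding descriptor entry irrelevant, which is one simple way to exclude the missing region from the comparison.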
Local Context Matching

Having constrained our search to
semantically similar scenes, we can use
template matching to more precisely align
each matching scene to the local context
around the hole
Local Context Matching

Pixel-wise alignment score



We define the local context to be all pixels
within an 80 pixel radius of the hole’s boundary
This context is compared against the 200 best
matching scenes
using SSD error in L*a*b* color space
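
The pixel-wise alignment score can be sketched as a brute-force template search. This is an illustrative sketch under simplifying assumptions: images are single-channel 2D lists rather than L*a*b* images, only translations are searched, and the function names are hypothetical. The default radius of 80 pixels comes from the slide.

```python
def local_context(hole, radius):
    """Valid pixels (outside the hole) within `radius` of any hole pixel."""
    h, w = len(hole), len(hole[0])
    hole_px = [(y, x) for y in range(h) for x in range(w) if hole[y][x]]
    ctx = []
    for y in range(h):
        for x in range(w):
            if hole[y][x]:
                continue
            if any((y - hy) ** 2 + (x - hx) ** 2 <= radius ** 2
                   for hy, hx in hole_px):
                ctx.append((y, x))
    return ctx


def best_placement(query, hole, scene, radius=80):
    """Find the translation of `scene` with minimum SSD over the context.

    An offset (dy, dx) maps query pixel (y, x) to scene pixel
    (y + dy, x + dx). Offsets that push any context pixel outside the
    scene are skipped. Returns (cost, (dy, dx)).
    """
    ctx = local_context(hole, radius)
    qh, qw = len(query), len(query[0])
    sh, sw = len(scene), len(scene[0])
    best_cost, best_offset = float("inf"), None
    for dy in range(-qh + 1, sh):
        for dx in range(-qw + 1, sw):
            if not all(0 <= y + dy < sh and 0 <= x + dx < sw
                       for y, x in ctx):
                continue
            cost = sum((query[y][x] - scene[y + dy][x + dx]) ** 2
                       for y, x in ctx)
            if cost < best_cost:
                best_cost, best_offset = cost, (dy, dx)
    return best_cost, best_offset
```

The exhaustive loop is only practical for tiny inputs; it is meant to show what is being minimized, not how to do it efficiently.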
Local Context Matching

Texture similarity score



Measure coarse compatibility of the proposed
fill-in region to the source image within the
local context
Computed as a 5x5 median filter of image
gradient magnitude at each pixel
The descriptors of the two images are
compared via SSD
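
The texture descriptor described above can be sketched as follows. This is a rough illustration on single-channel 2D lists; the difference scheme used for the gradient and the clamped border handling are assumptions, and the function names are hypothetical.

```python
from statistics import median


def gradient_magnitude(img):
    """Gradient magnitude from unnormalised differences, clamped at borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def median_filter(img, size=5):
    """Median over a size x size window, truncated at the image border."""
    h, w = len(img), len(img[0])
    r = size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[yy][xx]
                      for yy in range(max(y - r, 0), min(y + r + 1, h))
                      for xx in range(max(x - r, 0), min(x + r + 1, w))]
            out[y][x] = median(window)
    return out


def texture_descriptor(img):
    """5x5 median filter of the gradient magnitude at each pixel."""
    return median_filter(gradient_magnitude(img), size=5)


def texture_distance(a, b, pixels):
    """SSD between the two texture descriptors over the given pixels."""
    da, db = texture_descriptor(a), texture_descriptor(b)
    return sum((da[y][x] - db[y][x]) ** 2 for y, x in pixels)
```

A flat region yields a descriptor of zero everywhere, so a smooth sky and a heavily textured patch end up far apart even when their average colors agree.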
Local Context Matching

Composite each matching scene into the
incomplete image at its best placement
using a form of graph cut seam finding and
standard Poisson blending
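
To give a feel for what Poisson blending does, here is a one-dimensional sketch. It is a deliberate simplification: real images need the 2D version, and the Gauss-Seidel solver and the function name are assumptions chosen for brevity. Inside the pasted region the result keeps the second differences of the source, while the values just outside the region stay fixed to the target.

```python
def poisson_blend_1d(target, source, start, stop, iters=500):
    """Blend source[start:stop] into target while keeping source gradients.

    Solves f[i-1] - 2 f[i] + f[i+1] = Laplacian(source)[i] for i in
    [start, stop), with f equal to target everywhere else.
    Requires 0 < start < stop < len(target) so both neighbours exist,
    and source must be at least as long as target.
    """
    if not 0 < start < stop < len(target):
        raise ValueError("region must lie strictly inside the signal")
    f = list(target)
    f[start:stop] = source[start:stop]  # initial guess: plain paste
    # The discrete Laplacian of the source is the guidance term.
    lap = {i: source[i - 1] - 2 * source[i] + source[i + 1]
           for i in range(start, stop)}
    for _ in range(iters):  # Gauss-Seidel sweeps
        for i in range(start, stop):
            f[i] = (f[i - 1] + f[i + 1] - lap[i]) / 2.0
    return f
```

If the source differs from the target only by a constant brightness offset, the blended result reproduces the target exactly, which is why the pasted patch does not show up as a visible step in intensity.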
Local Context Matching

Past image completion algorithms


The remaining valid pixels in an image cannot
be changed
Our completion algorithm


Allows valid pixels to be removed from the query
image
But discourages cutting too many pixels
Local Context Matching

Past seam-finding



Minimizes the intensity difference between the two
images
This can cause the seam to pass through many high
frequency edges
Our seam-finding

Minimizes the gradient of the image difference
along the seam
Local Context Matching

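We find the seam by minimizing the
following cost function

\[ C(L) = \sum_{p} C_d(p, L(p)) + \sum_{p,q} C_i(p, q, L(p), L(q)) \]

C_d(p, L(p)): unary cost of assigning pixel p
to a specific label L(p)
C_i(p, q, L(p), L(q)): pairwise cost between pixels p and q
L(p): patch or exist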
Local Context Matching

For missing regions of the existing image

C_d(p, exist) is a very large number

For regions of the image not covered by the
scene match

C_d(p, patch) is a very large number

For all other pixels

\[ C_d(p, patch) = (k \cdot Dist(p, hole))^3 \]

Dist(p, hole) is the pixel's distance from the hole
k = 0.02
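
The unary term can be sketched as a small function. This is a hedged sketch: `LARGE` stands in for "a very large number", the zero cost for keeping a valid existing pixel is an assumption, and the cubic exponent is exposed as the `power` parameter and should be treated as an assumption as well. The default `k = 0.02` is the value given on the slide.

```python
LARGE = 1e9  # stand-in for "a very large number"


def unary_cost(label, in_hole, covered_by_patch, dist_to_hole,
               k=0.02, power=3):
    """Cost C_d(p, L(p)) of giving pixel p the label 'exist' or 'patch'.

    in_hole:          p lies in the missing region of the existing image
    covered_by_patch: the scene match has data at p
    dist_to_hole:     distance from p to the hole (0 inside the hole)
    """
    if label == "exist":
        # Missing pixels cannot keep their (non-existent) original value.
        return LARGE if in_hole else 0.0  # zero cost is an assumption
    if label == "patch":
        if not covered_by_patch:
            return LARGE
        # Taking patch pixels far from the hole is increasingly discouraged.
        return (k * dist_to_hole) ** power
    raise ValueError("label must be 'exist' or 'patch'")
```

With these defaults, a pixel 50 pixels from the hole costs about 1.0 to replace, while a pixel 10 pixels away costs about 0.008, so the seam is free to wander near the hole but is pulled back from cutting away large parts of the query image.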
Local Context Matching

C_i(p, q, L(p), L(q)) is non-zero only for
immediately adjacent, 4-way connected
pixels

If L(p) = L(q), the cost is zero
If L(p) ≠ L(q), C_i(p, q, L(p), L(q)) = diff(p, q)
diff(p, q) is the magnitude of the gradient of the
SSD between the existing image and the scene
match at pixels p and q
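
The pairwise term can be evaluated for a given labeling as sketched below. Note that this only scores a labeling; it does not run the min-cut that finds the best one. Images are single-channel 2D lists, the forward-difference gradient is an assumption, and taking diff(p, q) as the sum of the gradient magnitudes at p and q is one possible reading of the definition, not a confirmed detail.

```python
def ssd_map(existing, patch):
    """Per-pixel squared difference between the two images."""
    return [[(e - p) ** 2 for e, p in zip(er, pr)]
            for er, pr in zip(existing, patch)]


def grad_mag(img):
    """Gradient magnitude using forward differences, clamped at borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][x]
            gy = img[min(y + 1, h - 1)][x] - img[y][x]
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def pairwise_cost(labels, existing, patch):
    """Sum of C_i over all 4-connected neighbour pairs of a labeling."""
    g = grad_mag(ssd_map(existing, patch))
    h, w = len(labels), len(labels[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            # Right and down neighbours visit each 4-connected pair once.
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w and labels[y][x] != labels[ny][nx]:
                    total += g[y][x] + g[ny][nx]  # assumed form of diff(p, q)
    return total
```

Pairs with equal labels contribute nothing, so only the pixels along the seam are charged, and a seam placed where the difference image is smooth is cheap.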
Local Context Matching

Finally, we assign each composite a score combining four terms





The scene matching distance
The local context matching distance
The local texture similarity distance
The cost of the graph cut
We present the user with the 20 composites
with the lowest scores
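
The final ranking can be sketched as below. This sketch assumes the four terms are simply added; the `weights` parameter and the `Composite` class are hypothetical and only there to make the combination explicit. The default `n=20` matches the number of composites shown to the user.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Composite:
    name: str
    scene_distance: float    # scene matching distance
    context_distance: float  # local context matching distance
    texture_distance: float  # local texture similarity distance
    graph_cut_cost: float    # cost of the graph cut

    def score(self, weights: Sequence[float] = (1.0, 1.0, 1.0, 1.0)) -> float:
        """Weighted sum of the four terms; lower is better."""
        terms = (self.scene_distance, self.context_distance,
                 self.texture_distance, self.graph_cut_cost)
        return sum(w * t for w, t in zip(weights, terms))


def top_composites(composites: List[Composite], n: int = 20) -> List[Composite]:
    """Return the n composites with the lowest scores, best first."""
    return sorted(composites, key=lambda c: c.score())[:n]
```

Showing a short list rather than a single result lets the user pick among alternatives, since the lowest score is not always the most convincing completion.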
Results and Comparison

Lucky cases


The algorithm sometimes finds another image from the same physical
location
It is not our goal to complete scenes and objects
with their true selves in the database
Results and Comparison

Failure cases: artifacts
Results and Comparison

Failure cases: semantic violations
Results and Comparison

Failure cases: no object recognition
Results and Comparison

Failure cases: past methods perform well


On uniformly textured backgrounds
Our method is unlikely to find the exact same
texture in another photograph
Conclusion

This paper



Presents a new image completion algorithm
powered by a huge database of photographs
Unlike past methods, it does not reuse visual data
from within the source image
Further work


Two million images are still a tiny fraction of the
high-quality photographs available.
Our approach would be an attractive
web-based application.
Thank you!!!