Image and Depth from a Conventional Camera with a Coded Aperture

advertisement
Image and Depth from a
Conventional Camera with a
Coded Aperture
Anat Levin, Rob Fergus,
Frédo Durand, William Freeman
MIT CSAIL
Output #1: Depth map
Single input image:
Output #2: All-focused image
Lens and defocus
Image of a point
light source
Lens’ aperture
Lens
Camera
sensor
Point
spread
function
Focal plane
Lens and defocus
Image of a
defocused point
light source
Lens’ aperture
Object
Lens
Camera
sensor
Point
spread
function
Focal plane
Lens and defocus
Image of a
defocused point
light source
Lens’ aperture
Object
Lens
Camera
sensor
Point
spread
function
Focal plane
Lens and defocus
Image of a
defocused point
light source
Lens’ aperture
Object
Lens
Camera
sensor
Point
spread
function
Focal plane
Depth and defocus
Out of focus
Depth from defocus:
Infer depth by analyzing
local scale of defocus blur
In focus
Challenges
• Hard to discriminate a smooth scene from defocus blur
?
Out of focus
• Hard to undo defocus blur
Input
Ringing with conventional
deblurring algorithm
Key contributions
• Exploit prior on natural images
- Improve deconvolution
- Improve depth discrimination
Natural
• Coded aperture (mask inside lens)
- make defocus patterns different from
natural images and easier to discriminate
Unnatural
Related Work
•Active methods
e.g. Nayar et al. , Zhang and Nayar
•Depth from (de)focus
e.g. Pentland, Chaudhuri, Favaro et al.
• Plenoptic/ light field cameras
e.g. Adelson and Wang, Ng et al.
• Wave front coding
e.g. Cathey & Dowski
•Blind Deconvolution
e.g. Kundur and Hatzinakos , Fergus et al, Levin
Never recover both depth AND full resolution image from a single image
Defocus as local convolution
Calibrated blur kernels at
different depths
Input defocused
image
Defocus as local convolution
y  fk  x
Local
sub-window
Input defocused
image
Calibrated
blur kernels
at depth k
Sharp
sub-window
Depth k=1:
y  fk  x
Depth k=2:
y  fk  x
Depth k=3:
y  fk  x
Overview
Try deconvolving local input windows with different scaled filters:
 
 
 
Somehow: select best scale.
?
?
?
Larger scale
Correct scale
Smaller scale
Challenges
• Hard to deconvolve even
when kernel is known
Input
• Hard to identify
correct scale:
?
?
?
 
 
 
Ringing with the traditional
Richardson-Lucy deconvolution
algorithm
Larger scale
Correct scale
Smaller scale
Deconvolution is ill posed
f x y

?
=
Linear, but rank deficient
Deconvolution is ill posed
f x y
Solution 1:

?
=
?
=
Solution 2:

Linear, but rank deficient
Image Restoration Techniques
• Frequency domain
– Wiener filter, deconvwnr()
– Direct inverse method
? 
Clear
=
Kernel
Blur
X
• Spatial domain
– Richardson-Lucy method, deconvlucy()
– MAP estimator, iterative method
– Slow convergence for high freq. (needs priors)
Idea 1: Natural images prior
What makes images special?
Natural
Image
gradient
Unnatural
Natural Image Statistics
Characteristic distribution with heavy tails
Histogram of image gradients
Blurry image has different statistics
Histogram of image gradients
Deconvolution with prior
| f  x  y |   i  (xi )
x  arg min
2
Convolution error

?
_
Derivatives prior
2
+
Low
Equal convolution error
2

?
_
+
High
Comparing deconvolution algorithms
(Non blind) deconvolution code available online:
http://groups.csail.mit.edu/graphics/CodedAperture/
Input
Richardson-Lucy
 (x)  x
2
 (x)  x
0.8
“spread” gradients
“localizes” gradients
Gaussian prior
Sparse prior
Comparing deconvolution algorithms
(Non blind) deconvolution code available online:
http://groups.csail.mit.edu/graphics/CodedAperture/
Input
Richardson-Lucy
 (x)  x
2
 (x)  x
0.8
“spread” gradients
“localizes” gradients
Gaussian prior
Sparse prior
Recall: Overview
Try deconvolving local input windows with different scaled filters:
 
 
 
Larger scale
Correct scale
Smaller scale
?
?
?
Somehow: select best scale.
Challenge: smaller scale not so different than correct
Idea 2: Coded Aperture
• Mask (code) in aperture plane
- make defocus patterns different from
natural images and easier to discriminate
Conventional
aperture
Our coded
aperture
Solution: lens with occluder
Object
Lens
Camera
sensor
Point
spread
function
Focal plane
Solution: lens with occluder
Image of a
defocused point
light source
Aperture pattern
Object
Lens with coded
aperture
Camera
sensor
Point
spread
function
Focal plane
Solution: lens with occluder
Image of a
defocused point
light source
Aperture pattern
Object
Lens with coded
aperture
Camera
sensor
Point
spread
function
Focal plane
Solution: lens with occluder
Image of a
defocused point
light source
Aperture pattern
Object
Lens with coded
aperture
Camera
sensor
Point
spread
function
Focal plane
Solution: lens with occluder
Image of a
defocused point
light source
Aperture pattern
Object
Lens with coded
aperture
Camera
sensor
Point
spread
function
Focal plane
Solution: lens with occluder
Image of a
defocused point
light source
Aperture pattern
Object
Lens with coded
aperture
Camera
sensor
Point
spread
function
Focal plane
Why coded?
Coded aperture- reduce uncertainty in scale identification
Conventional
Larger scale
Correct scale
Smaller scale
Coded
Frequency
spectrum
0
Filter,
1st
scale
spectrum
=
1st observed image
0
Frequency
Frequency
0
Sharp Image
Frequency
0
spectrum
spectrum
Sharp Image
Filter, 2nd scale
spectrum
spectrum
Convolution- frequency domain representation
=
2nd observed image
0
Frequency
0
Spatial convolution
Frequency

Output spectrum has zeros
frequency multiplication
where filter spectrum has zeros
Estimated image
?
Frequency
spectrum
0
Filter, correct scale
spectrum
Observed image
0
Division by zero
Estimated image
?

Frequency
0
spectrum
=
spectrum
spectrum
Coded aperture: Scale estimation and division by zero
spatial ringing
0

Frequency
0

Frequency
Filter, wrong scale
=
Frequency
Estimated image
?
Frequency
spectrum
0
Filter, correct scale
spectrum
Observed image
0
0

Frequency
division of tiny value by zero
no spatial ringing

?
Frequency
Filter, wrong scale
0

Frequency
Estimated image
0
spectrum
=
spectrum
spectrum
Division by zero with a conventional aperture?

Frequency
=
Filter Design
Analytically search for a pattern maximizing discrimination
between images at different defocus scales (KL-divergence)
Account for image prior and physical constraints
Score
More discrimination
between scales
Less discrimination
between scales
Sampled aperture patterns
Conventional
aperture
Kullback-Leibler divergence
Non-symmetric measure of distance between 2 probabilities.
Closely related to entropy.
Use probability distribution in the frequency domain.
Filter Design Considerations
•Filters should be binary
•Easy to cut from material – No floating elements
•Avoid using the full aperture due to radial distortion in optics
•Minimum size on filter holes to avoid diffraction
Filter Design Optimization
Input:
•Binary 13x13 Patterns with 1mm features
•8 different scales vary between 5 to 15 pixels
Method:
•KL-divergence scores computed for pairs of varying scales of each
filter design
•Minimum score between pairs is assigned as score for filter
Result:
Score
More discrimination
between scales
Less discrimination
between scales
Sampled aperture patterns
Conventional
aperture
Zero frequencies- pros and cons
Previous methods:
0
0
Our solution:
0
Include zero frequencies:
No zero frequencies:
+
-
Filter can be easily inverted
Weaker depth discrimination
0
+
+
Zeros improve depth discrimination
Inversion difficult
Inversion made possible
with image priors
Depth results
Regularizing depth estimation
Try deblurring with 10 different aperture scales
x  arg min | f  x  y |2
Convolution error

_
  i  (xi )
Derivatives prior
2
+
Keep minimal error scale in each local window + regularization
Input
Local depth estimation
Regularized depth
Regularizing depth estimation
Local depth estimation
Input
Regularized depth
Sometimes, manual intervention
Input
Local depth estimation
Regularized depth
After user corrections
All focused results
Input
All-focused
(deconvolved)
Close-up
Original image
All-focus image
Input
All-focused
(deconvolved)
Close-up
Original image
All-focus image
Naïve sharpening
Comparison- conventional aperture result
Ringing due to wrong scale estimation
Comparison- coded aperture result
Application: Digital refocusing from a single image
Application: Digital refocusing from a single image
Application: Digital refocusing from a single image
Application: Digital refocusing from a single image
Application: Digital refocusing from a single image
Application: Digital refocusing from a single image
Application: Digital refocusing from a single image
Coded aperture: pros and cons
AND depth at a single shot
+ Image
+ No loss of image resolution
+ Simple modification to lens
- Depth is coarse
unable to get depth at untextured areas,
might need manual corrections.
+ But depth is a pure bonus
- Loss some light
+ But deconvolution increases depth of field
Deconvolution code available
http://groups.csail.mit.edu/graphics/CodedAperture/
50mm f/1.8:
$79.95
Cardboard:
$1
Tape:
$1
Depth acquisition:
priceless
Download