Non-Local Characterization of Scenery Images:
Statistics, 3D Reasoning, and a Generative
Model
Tamar Avraham and Michael Lindenbaum
Technion
Characterization of Scenery Images: Overview
scenery images (LabelMe)
manual segmentation and region annotation
● Statistical Characterization
    ● Rough shape of regions
    ● Relative location of regions
    ● Shape of boundaries
● 3D Reasoning
    ● Why are background contours horizontal?
● A Generative Model
    ● Provides a prior on scenery image annotation
    ● Generates image sketches, capturing the gist of scenery images
Given the above segmentation (without texture), which region labeling is more likely:
ground, trees, mountain, sand
OR
sky, mountain, sea, rocks?
Property 1 : Horizontalness
Most background objects exceed the image width
Background objects are wide and low, while foreground objects' shapes tend to be isotropic
background objects : sky, mountain, sea, trees, field, river, sand, ground, grass, land, rocks,
plants, snow, plain, valley, bank, fog bank, desert, lake, beach, cliff, floor
foreground objects: all others
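One simple way to quantify this property, assuming each region is available as a binary mask, is the aspect ratio of its bounding box together with the fraction of the image width it spans. The function name and the mask representation below are illustrative assumptions, not the measurement used by the authors.

```python
def horizontalness(mask):
    """Return (bounding-box width / height, fraction of image width spanned).

    `mask` is a list of rows; truthy cells belong to the region.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("empty region")
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    return width / height, width / len(mask[0])


# A wide, low "background-like" strip and a compact "foreground-like" blob.
strip = [[0] * 8, [1] * 8, [1] * 8, [0] * 8]
blob = [[0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 1, 1, 0, 0, 0, 0],
        [0, 0, 1, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0]]

strip_ratio, strip_span = horizontalness(strip)
blob_ratio, blob_span = horizontalness(blob)
```

The strip spans the full image width and is four times wider than it is tall, while the blob has an aspect ratio of 1.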
Property 2: Order / Relative Location
The relative top-bottom locations of types of background are
often highly predictable
Figure: the probability that a background region of identity A appears above a background region of identity B, summarized in a histogram for various pairs of background identities.
Figure: a DAG over the background identities sky, mountain, desert, trees, field, valley, river, lake, land, sea, plants, grass, rocks, sand, and ground.
A topological ordering of background identities can be defined: this DAG is associated with the reachability relation
R = {(A, B) | p(A above B) > 0.7}
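A minimal sketch of how such a relation could be estimated from annotations, assuming each image is summarized as a list of background labels ordered from top to bottom. Counting every ordered pair in a sequence, adjacent or not, is an assumption of this sketch, and the toy sequences are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def above_probabilities(sequences):
    """Estimate p(A above B) from top-to-bottom label sequences.

    For every image, each pair of distinct labels (A, B) where A is listed
    before B counts as one observation of "A above B".
    """
    above = defaultdict(int)
    for seq in sequences:
        for a, b in combinations(seq, 2):  # a precedes b, so a is above b
            if a != b:
                above[(a, b)] += 1
    probs = {}
    for (a, b), n_ab in above.items():
        total = n_ab + above.get((b, a), 0)
        probs[(a, b)] = n_ab / total
    return probs


def order_relation(probs, threshold=0.7):
    """Keep the pairs (A, B) with p(A above B) > threshold."""
    return {pair for pair, p in probs.items() if p > threshold}


# Hypothetical annotations, each listed from image top to image bottom.
sequences = [
    ["sky", "mountain", "sea", "sand"],
    ["sky", "mountain", "trees", "field"],
    ["sky", "trees", "field"],
    ["sky", "sea", "sand"],
    ["sky", "field", "trees"],
]
relation = order_relation(above_probabilities(sequences))
```

In this toy data trees appear above field in only two of three images, so the pair stays out of the relation, while pairs such as (sky, sea) are included.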
Property 3: Boundary Shape
The characteristics of a region's upper contour correlate with
the region's identity
A sample of contour segments associated with background object classes mountain, sea, and trees
Chunks of upper boundaries as 1D signals: Curves associated with sea,
grass or field resemble DC signals. Curves associated with trees and
plants are high frequency signals. Curves associated with mountains
resemble signals with low frequency and high amplitude
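The three signal types can be told apart with very crude descriptors. The sketch below uses the standard deviation of the curve as "amplitude" and the mean absolute change between neighbouring columns as a stand-in for high-frequency content. Both descriptors and the synthetic curves are assumptions for illustration, not the features used in the work.

```python
import math
from statistics import mean, pstdev


def boundary_descriptors(curve):
    """Return (amplitude, roughness) of a boundary given as one height per column.

    amplitude: standard deviation of the heights (0 for a flat, DC-like curve).
    roughness: mean absolute change between neighbouring columns.
    """
    amplitude = pstdev(curve)
    roughness = mean(abs(b - a) for a, b in zip(curve, curve[1:]))
    return amplitude, roughness


n = 64  # chunk length in pixels
sea = [10.0] * n                                                     # flat horizon
mountain = [20.0 * math.sin(2 * math.pi * x / n) for x in range(n)]  # one slow, tall cycle
trees = [2.0 * (-1) ** x for x in range(n)]                          # small, fast wiggle

sea_amp, sea_rough = boundary_descriptors(sea)
mnt_amp, mnt_rough = boundary_descriptors(mountain)
tree_amp, tree_rough = boundary_descriptors(trees)
```

The flat curve scores zero on both descriptors, the mountain-like curve has the largest amplitude, and the tree-like curve has the largest roughness.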
3D Reasoning:
Why are background regions horizontal?
Land regions whose contour tangents in aerial images are uniformly distributed appear with
strong horizontal bias in images taken by a photographer standing on the ground
Flatland
“Place a penny on the middle of one of your tables in Space ... look down upon it. It will
appear a circle.... gradually lower your eyes ... and you will find the penny becoming more
and more oval to your view....” From Flatland, by Edwin A. Abbott, 1884
Θ - the set of tangent angles for contours in aerial images (relative to an arbitrary 2D axis on the surface)
Θ’ - the set of angles that are the projections of the angles in Θ on the camera’s image plane
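Figure: a schematic illustration of an aerial image (land regions labeled weed, sand, flora, soil, lake, grass), with a point p on a contour whose tangent angle is θ ∈ Θ.
$$\tan\theta' = \frac{h \tan\theta}{z - x \tan\theta}$$
Figure (axes X1, X2, X3; point p at coordinates x, z; camera height h; projected angle θ’ ∈ Θ’):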
An image is taken by a photographer
standing on the ground
The distribution of Θ’, assuming
Θ ~ U[0°, 180°), h ≈ 2 m,
z ~ U[0, 1000] m, x ~ U[0, 500] m
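The horizontal bias can be checked with a small Monte Carlo simulation. The sketch below assumes the flat-ground projection tan θ' = h tan θ / (z − x tan θ) and the sampling ranges quoted above. The 10° tolerance, the sample count, and the function names are choices made for this sketch.

```python
import math
import random


def projected_angle(theta_deg, h, x, z):
    """Image-plane tangent angle (degrees, in [0, 180)) for a ground contour.

    Uses tan(theta') = h*tan(theta) / (z - x*tan(theta)), written with atan2
    on (h*sin, z*cos - x*sin) so that theta = 90 degrees needs no special case.
    """
    t = math.radians(theta_deg)
    angle = math.degrees(math.atan2(h * math.sin(t), z * math.cos(t) - x * math.sin(t)))
    return angle % 180.0


def horizontal_fraction(n_samples=20000, tolerance_deg=10.0, seed=0):
    """Fraction of projected tangents within `tolerance_deg` of horizontal."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        theta = rng.uniform(0.0, 180.0)
        z = rng.uniform(0.0, 1000.0)
        x = rng.uniform(0.0, 500.0)
        a = projected_angle(theta, h=2.0, x=x, z=z)
        if min(a, 180.0 - a) <= tolerance_deg:
            hits += 1
    return hits / n_samples


fraction = horizontal_fraction()
uniform_baseline = 2 * 10.0 / 180.0  # what uniformly distributed angles would give
```

Uniformly distributed image tangents would put about 11% of the angles within 10° of horizontal. Under these assumptions the simulated fraction is far higher.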
3D Reasoning cont.
Ground elevation and slope statistics
Two landscape image contour types:
1) The contours between different types of regions on the terrain
2) The contours of mountains associated with occluding
boundaries (e.g., skylines)
3D Reasoning cont.
Ground elevation and slope statistics
The contours between different types of regions on the terrain
A point p lies on a boundary between land regions,
located on an elevated surface with gradient angle ϕ.
The plane is rotated by an angle ω relative to the X1 axis

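Figure: the point P = (x, H, z) on the inclined plane, with origin O, axes X1, X2, X3, and rotation angle ω.
$$\tan\theta' = \frac{H(\cos\theta\sin\omega + \sin\theta\cos\varphi\cos\omega) - z\sin\theta\sin\varphi}{x(\cos\theta\sin\omega + \sin\theta\cos\varphi\cos\omega) - z(\cos\theta\cos\omega - \sin\theta\cos\varphi\sin\omega)}$$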
Estimated terrain slope distribution
using the IIASA-LUC dataset
The distribution of Θ’, assuming Θ ~ U[0°, 180°),
ϕ ~ slope statistics, ω ~ [-90°, 90°]. The distribution of H
was estimated by sampling an elevation map
at pairs of locations up to 9 km apart
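A sketch of the projection for a contour on an inclined plane, under one possible sign convention. The contour direction (cos θ, 0, sin θ) on a horizontal plane is tilted by ϕ about X1 and then rotated by ω about the vertical axis X2, and the camera is a pinhole at the origin looking along X3. These conventions, and treating H as the height of the point relative to the camera (so a camera 2 m above flat ground gives H = −2), are assumptions of this sketch.

```python
import math


def projected_tangent(theta, phi, omega, x, H, z):
    """Image-plane tangent angle (radians, in [0, pi)) of a contour on a tilted plane.

    The 3D direction of the contour is (d1, d2, d3); differentiating the
    pinhole projection (X1/X3, X2/X3) along it gives
    tan(theta') = (H*d3 - z*d2) / (x*d3 - z*d1).
    """
    d1 = math.cos(theta) * math.cos(omega) - math.sin(theta) * math.cos(phi) * math.sin(omega)
    d2 = math.sin(theta) * math.sin(phi)
    d3 = math.cos(theta) * math.sin(omega) + math.sin(theta) * math.cos(phi) * math.cos(omega)
    numerator = H * d3 - z * d2
    denominator = x * d3 - z * d1
    return math.atan2(numerator, denominator) % math.pi


theta = math.radians(60.0)
# Flat ground (phi = omega = 0) seen from 2 m above: reduces to the flat-ground formula.
flat = projected_tangent(theta, phi=0.0, omega=0.0, x=100.0, H=-2.0, z=300.0)
flat_closed_form = 2.0 * math.tan(theta) / (300.0 - 100.0 * math.tan(theta))
# The same contour on a surface with a 20 degree gradient.
sloped = projected_tangent(theta, phi=math.radians(20.0), omega=0.0, x=100.0, H=-2.0, z=300.0)
```

With ϕ = ω = 0 the expression collapses to h tan θ / (z − x tan θ), which is a useful sanity check on the signs.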
3D Reasoning cont.
Ground elevation and slope statistics
The contours of mountains associated with occluding boundaries
Tangents in images bounded by the
max-slope-over-land-regions statistics
Estimated distribution of the maximum slope over land
regions, each covering approx. 9 square kilometers
* The paper also discusses the effect of land cover and points out other factors that should be considered in a more complete analysis.
The Generative Model
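A segmentation into n regions, with region heights h_i and boundaries S_i:
$$S = (h_1, \ldots, h_n, S_1, \ldots, S_{n-1})$$
Figure: a segmented image with region heights h1, ..., h4 and boundaries S1, S2, S3.
$$L = \{\text{sky}, \text{ground}, \text{sea}, \text{trees}, \text{grass}, \text{plants}, \text{rocks}, \ldots\}$$
$$l = (l_1, \ldots, l_n) \in L^n$$
$$P(l \mid S) = \,?$$
$$P(l \mid S) = \frac{P(S \mid l)\, P(l)}{\sum_{l' \in L^n} P(S \mid l')\, P(l')}$$
Top-bottom order (the annotation):
$$P(l = (l_1, \ldots, l_n)) = P_1(l) = M(\text{'top'}, l_1) \prod_{i=1}^{n-1} M(l_i, l_{i+1})\; M(l_n, \text{'bottom'})$$
Figure: the Markov network over region types (top, sky, trees, ground, sea, bottom).
$$P(S \mid l) = \prod_{i=1,\ldots,n} P_2(h_i \mid l_i) \prod_{i=1,\ldots,n-1} P_3(S_i \mid l_{i+1})$$
Region height, P_2(h_i | l_i): a normal distribution for the height covered by each region type.
Upper boundaries, P_3(S_i | l_{i+1}): modeled by PCA of “1D” signals.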
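A minimal sketch of the label prior and the Bayes posterior, assuming the prior is a first-order Markov chain from 'top' to 'bottom' with transition table M. The transition values are hypothetical, and the brute-force enumeration is only practical for a handful of regions and labels.

```python
from itertools import product

# Hypothetical transition probabilities M[a][b]: probability that b lies
# directly below a. 'top' and 'bottom' are the virtual start and sink states.
M = {
    "top": {"sky": 1.0},
    "sky": {"mountain": 0.5, "sea": 0.3, "ground": 0.2},
    "mountain": {"sea": 0.4, "ground": 0.4, "bottom": 0.2},
    "sea": {"ground": 0.3, "bottom": 0.7},
    "ground": {"bottom": 1.0},
}


def prior(labels):
    """P(l) = M(top, l_1) * prod_i M(l_i, l_{i+1}) * M(l_n, bottom)."""
    path = ["top", *labels, "bottom"]
    p = 1.0
    for a, b in zip(path, path[1:]):
        p *= M.get(a, {}).get(b, 0.0)
    return p


def posterior(likelihood, n, label_set):
    """P(l | S) over all labelings of length n, by brute-force enumeration.

    `likelihood(labels)` stands in for P(S | l).
    """
    scores = {l: likelihood(l) * prior(l) for l in product(label_set, repeat=n)}
    total = sum(scores.values())
    return {l: s / total for l, s in scores.items() if s > 0}


labels = ["sky", "mountain", "sea", "ground"]
# With an uninformative likelihood the posterior is the renormalized prior.
post = posterior(lambda l: 1.0, n=3, label_set=labels)
best = max(post, key=post.get)
```

Replacing the constant likelihood with the product of height and boundary terms would give the full posterior over annotations.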
The Generative Model: advantages
The generative nature of the model makes it
possible to:
1) Generate image sketches, capturing the
gist of scenery images
→ ECCV10
2) Obtain priors for region annotation
→ more recent work
The Generative Model:
Training
– Given a set of manually segmented and
annotated scenery images:
• Top-bottom order: estimate the transition matrix M
by counting the number of occurrences of the different
‘moves’.
• Relative region coverage: estimate the mean and variance
of the relative average height of each type of region.
• Upper boundary: for each background region type,
collect chunks 64 pixels long. Find the first k principal
components and eigenvalues so that 95% of the
variation in the training set is modeled (μ, Φ, λ).
– It is possible to train different models for different
scenery categories.
Here we trained one model on 3 categories together: coast,
mountain, open country.
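The training steps for the order and boundary terms can be sketched as follows, assuming annotations are given as top-to-bottom label lists and boundary chunks as fixed-length arrays. The function names, the returned (mean, components, eigenvalues) triple, and the synthetic chunks are assumptions of this sketch, which relies on NumPy.

```python
from collections import Counter, defaultdict

import numpy as np


def estimate_transitions(sequences):
    """Row-normalized counts of 'moves' between vertically adjacent labels."""
    counts = defaultdict(Counter)
    for seq in sequences:
        path = ["top", *seq, "bottom"]
        for a, b in zip(path, path[1:]):
            counts[a][b] += 1
    return {a: {b: c / sum(row.values()) for b, c in row.items()}
            for a, row in counts.items()}


def fit_boundary_pca(chunks, explained=0.95):
    """PCA of boundary chunks; keep the first k components reaching `explained`.

    `chunks` has shape (num_chunks, chunk_length). Returns (mean, components,
    eigenvalues), with components of shape (chunk_length, k).
    """
    chunks = np.asarray(chunks, dtype=float)
    mean = chunks.mean(axis=0)
    cov = np.cov(chunks - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # largest first
    eigvals, eigvecs = np.clip(eigvals[order], 0.0, None), eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(ratio, explained) + 1)
    return mean, eigvecs[:, :k], eigvals[:k]


M = estimate_transitions([["sky", "mountain", "sea"], ["sky", "sea"],
                          ["sky", "trees", "ground"]])

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 64)
# Synthetic chunks: random mixtures of two smooth shapes plus slight noise.
chunks = (rng.normal(size=(200, 1)) * 10.0 * np.sin(2 * np.pi * x)
          + rng.normal(size=(200, 1)) * 3.0 * x
          + rng.normal(scale=0.01, size=(200, 64)))
mean, components, eigenvalues = fit_boundary_pca(chunks)
```

Because the synthetic chunks are built from two shapes, at most two components are needed to pass the 95% threshold. Real boundary chunks would typically need more.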
Figure: the Markov network over region types (top, sky, trees, ground, sea, bottom).
The Generative Model:
Generating Sketches
– Randomly select the top-bottom sequence by a random
walk on the Markov network, starting at ‘top’ and stopping at
the sink ‘bottom’.
– Randomly select the relative average height of each region.
– Randomly generate the boundaries:
• For each S_i, generate 4 chunks S_{i,1}, ..., S_{i,4}:
$$S_{i,m} = \mu + \Phi b, \quad b_j \sim N(0, \lambda_j)$$
Figure: examples with regions labeled sky, mountain, and trees.
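The sampling steps can be sketched as below, assuming a transition table M with virtual 'top' and 'bottom' states and a PCA boundary model given as a mean chunk, a list of component vectors, and their eigenvalues (treated here as variances). The toy table, the single sinusoidal component, and the plain concatenation of the four chunks are assumptions of this sketch.

```python
import math
import random


def sample_sequence(M, rng):
    """Random walk on the transition table from 'top' until 'bottom' is reached."""
    sequence, state = [], "top"
    while True:
        successors = list(M[state])
        weights = [M[state][s] for s in successors]
        state = rng.choices(successors, weights=weights, k=1)[0]
        if state == "bottom":
            return sequence
        sequence.append(state)


def sample_chunk(mean, components, eigenvalues, rng):
    """Draw one chunk as mean + sum_j b_j * component_j, b_j ~ N(0, eigenvalue_j)."""
    b = [rng.gauss(0.0, ev ** 0.5) for ev in eigenvalues]  # gauss takes a std. dev.
    return [m + sum(bj * comp[t] for bj, comp in zip(b, components))
            for t, m in enumerate(mean)]


def sample_boundary(mean, components, eigenvalues, rng, n_chunks=4):
    """Concatenate `n_chunks` independently drawn chunks into one boundary."""
    boundary = []
    for _ in range(n_chunks):
        boundary.extend(sample_chunk(mean, components, eigenvalues, rng))
    return boundary


# Hypothetical, acyclic transition table, so the walk always terminates.
M = {
    "top": {"sky": 1.0},
    "sky": {"mountain": 0.6, "sea": 0.4},
    "mountain": {"sea": 0.5, "bottom": 0.5},
    "sea": {"bottom": 1.0},
}
rng = random.Random(0)
labels = sample_sequence(M, rng)

# Toy boundary model: zero mean and a single sinusoidal component.
mean = [0.0] * 64
components = [[math.sin(2 * math.pi * t / 64) for t in range(64)]]
eigenvalues = [25.0]
boundary = sample_boundary(mean, components, eigenvalues, rng)
```

Independently drawn chunks generally do not meet at their ends, so a practical generator would also need some way of joining them smoothly.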
The Generative Model:
Generated Results
typical scenery images (LabelMe)
manual segmentation and region annotation
semantic sketches of scenery images generated by our model
The Generative Model:
(More) Generated Results
Region Classification
Q: Can the new cues contribute to region classification/annotation?
Figure: only layout (sky / mountain / sea? ground? / rocks? plants?) + only color&texture (sky? sea? / mountain? ground? / sea / rocks) = sky / mountain / sea / rocks
A: Complementary to textural & color cues
Goal: to show that region classification using global + local
descriptors is better than only local descriptors
Region Classification - HMM
Figure: a chain of hidden region labels l_1, ..., l_i, l_{i+1}, ..., l_n, each l_i attached to its observations T_i, S_i, and H_i.
Marginals by the sum-product message passing algorithm
Classification by max:
$$c_i = \arg\max_{L_j \in L} p_i^{L_j}$$
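On a chain, sum-product message passing reduces to the forward-backward recursion. The sketch below assumes a transition table M with virtual 'top' and 'bottom' states and, for each region, a dictionary of label likelihoods that stands in for the combined cue evidence. All numbers are hypothetical.

```python
def chain_marginals(labels, M, evidence):
    """Exact posterior marginals on a chain by forward-backward message passing.

    labels:   list of label names.
    M:        M[a][b], probability that b lies directly below a; includes the
              virtual states 'top' and 'bottom'.
    evidence: one dict per region (top to bottom) mapping label -> likelihood
              of that region's observations under the label.
    """
    n = len(evidence)

    def t(a, b):
        return M.get(a, {}).get(b, 0.0)

    # Forward messages: everything known about regions above and including i.
    fwd = [{l: t("top", l) * evidence[0][l] for l in labels}]
    for i in range(1, n):
        fwd.append({l: evidence[i][l] * sum(fwd[i - 1][k] * t(k, l) for k in labels)
                    for l in labels})

    # Backward messages: everything known about regions below i.
    bwd = [None] * n
    bwd[n - 1] = {l: t(l, "bottom") for l in labels}
    for i in range(n - 2, -1, -1):
        bwd[i] = {l: sum(t(l, k) * evidence[i + 1][k] * bwd[i + 1][k] for k in labels)
                  for l in labels}

    marginals = []
    for i in range(n):
        unnorm = {l: fwd[i][l] * bwd[i][l] for l in labels}
        z = sum(unnorm.values())
        marginals.append({l: p / z for l, p in unnorm.items()})
    return marginals


labels = ["sky", "mountain", "sea"]
M = {
    "top": {"sky": 0.9, "mountain": 0.1},
    "sky": {"mountain": 0.6, "sea": 0.4},
    "mountain": {"mountain": 0.2, "sea": 0.4, "bottom": 0.4},
    "sea": {"bottom": 1.0},
}
# Hypothetical local evidence: the top region looks equally like sky or sea.
evidence = [
    {"sky": 0.5, "mountain": 0.0, "sea": 0.5},
    {"sky": 0.1, "mountain": 0.8, "sea": 0.1},
    {"sky": 0.3, "mountain": 0.1, "sea": 0.6},
]
marginals = chain_marginals(labels, M, evidence)
classification = [max(m, key=m.get) for m in marginals]
```

The top region is ambiguous from local evidence alone, and the layout prior resolves it to sky, because sea never directly follows 'top' in this table.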
Region Classification - Discussion
General object annotation and detection using context:
G. Csurka and F. Perronnin. An efficient approach to semantic segmentation. IJCV, 2010.
C. Desai, D. Ramanan, and C. Fowlkes. Discriminative models for multi-class object layout. ICCV, 2009.
C. Galleguillos and S. Belongie. Context based object categorization: A critical survey. Comput. Vis. Image Understand., 2010.
X. He, R. S. Zemel, and D. Ray. Learning and incorporating top-down cues in image segmentation. ECCV, 2006.
S. Kumar and M. Hebert. A hierarchical field framework for unified context-based classification. ICCV, 2005.
A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora, and S. Belongie. Objects in context. ICCV, 2007.
J. Shotton, J. Winn, C. Rother, and A. Criminisi. Textonboost for image understanding: multi-class object recognition
and segmentation by jointly modeling appearance, shape and context. IJCV, 81(1):2–23, 2009.
Approximate inference needed (e.g., greedy iterative methods,
loopy belief propagation)
Background region classification of scenery images: a 1D problem
Enables exact inference
Region Classification - Details
Textural & color features, as in Vogel & Schiele, IJCV 07:
HSV color histograms
Edge direction histograms
Gray-level co-occurrence matrices (GLCM, Haralick et al. 73),
with 4 offsets. For each: contrast, energy, entropy, homogeneity, inverse difference moment, and
correlation.
p(l_i = L_j | T_i) and p(l_i = L_j | S_i) are each modeled with a
multiclass probabilistic SVM (LibSVM; Wu, Lin, Weng 04) with an RBF kernel.
5-fold cross validation at the image level. Each training run includes
parameter selection by cross validation within the training set.
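As an illustration of the gray-level co-occurrence features, the sketch below builds a normalized symmetric co-occurrence matrix for one offset and computes four of the listed statistics. Definitions of these statistics vary between implementations, and the formulas here are one common choice rather than necessarily the ones used in the experiments.

```python
import math


def glcm(image, offset, levels):
    """Normalized, symmetric gray-level co-occurrence matrix for one offset.

    image:  list of rows of integer gray levels in [0, levels).
    offset: (row_step, col_step) between the two pixels of a pair.
    """
    dr, dc = offset
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Contrast, energy, entropy and homogeneity of a normalized GLCM."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }


flat_image = [[0] * 4 for _ in range(4)]
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]

flat = glcm_features(glcm(flat_image, offset=(0, 1), levels=2))
checker = glcm_features(glcm(checkerboard, offset=(0, 1), levels=2))
```

A uniform patch has zero contrast and maximal energy, while a checkerboard, whose horizontal neighbours always differ, has high contrast and lower homogeneity.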
Dataset of 1144 images (LabelMe: coast, open country, mountains).
Regions:
Region type     Count
sky             1120
mountain        1489
sea             401
trees           622
field           366
river           150
sand            182
ground          94
grass           36
land            41
rocks           201
plants          143
snow            50
plateau         28
valley          20
bank            20
lake            9
beach           3
cliff           4
Total regions   4979
Region Classification – Results 1
Figure: input images, each shown with its region labels according to the ground truth, relative location, boundary shape, color&texture, and all cues.
Region Classification – Results 2
Figure: input images, each shown with its region labels according to the ground truth, relative location, boundary shape, color&texture, and all cues.
Region Classification – Results 3
Cue                               Accuracy
Color&Texture                     0.615
Relative Location                 0.503
Boundary Shape                    0.452
Relative Loc. + Boundary Shape    0.573
Color&Texture + Relative Loc.     0.676
Color&Texture + Boundary Shape    0.641
All (ORC)                         0.682
19 categories
Accuracy per class:
Color&texture: higher accuracy for trees, field, rocks, plants, snow
New cues: better for sky, mountain, sea, sand
Other classes: performance is very low because of their small number of examples.
Discussion
We achieved the goal of showing that the new cues improve region classification
based only on texture & color.
Many classifications counted as errors are actually correct
Related to recent work on object categorization with a huge number of categories
(Deng, Berg, Li, Fei-Fei ECCV10; Fergus, Weiss, Torralba ECCV10)
Work in progress.
Summary
Focus on the characterization of scenery images
Intuitive observations regarding the statistics of co-occurrence, relative location, and shape of background
regions were explicitly quantified and modeled
Some 3D reasoning
Non-local properties can capture the gist of images
Contextual background region classification with exact
inference
The new cues improve region classification based on
local descriptors
Future & General Discussion
A better way to evaluate region classification: work in progress
Use the layout cues for better top-down segmentation
(Felzenszwalb&Veksler, CVPR 10). Shape prior to address
“shrinking bias” (Vicente, Kolmogorov, Rother, CVPR 08)
Use the layout cues to improve scene categorization
Augment foreground objects into the model. Extend model to
other domains.
Use the cues to align pictures.
Generated sketches as a basis for rendering.
Scenery: too simple?
Let's first succeed in understanding those images, following the
evolution of the biological visual system
Thank You
For Your Time