Non-Local Characterization of Scenery Images: Statistics, 3D Reasoning, and a Generative Model
Tamar Avraham and Michael Lindenbaum
Technion

Characterization of Scenery Images: Overview
[Figure: scenery images (LabelMe) with their manual segmentation and region annotation]
● Statistical Characterization
  ● Rough shape of regions
  ● Relative location of regions
  ● Shape of boundaries
● 3D Reasoning
  ● Why are background contours horizontal?
● A Generative Model
  ● Provides a prior on scenery image annotation
  ● Generates image sketches, capturing the gist of scenery images

Given the above segmentation (without texture), which region labeling is more likely: (ground, trees, mountain, sand) or (sky, mountain, sea, rocks)?

Property 1: Horizontalness
● Most background objects exceed the image width.
● Background objects are wide and of low height, while the shapes of foreground objects tend to be isotropic.
● Background objects: sky, mountain, sea, trees, field, river, sand, ground, grass, land, rocks, plants, snow, plain, valley, bank, fog bank, desert, lake, beach, cliff, floor.
● Foreground objects: all others.

Property 2: Order / Relative Location
● The relative top-bottom locations of background types are often highly predictable.
● [Figure: histogram of the probability that a background region of identity A appears above a background region of identity B, for various pairs of background identities]
● A topological ordering of background identities can be defined: the DAG is associated with the reachability relation $R = \{(A,B) \mid p(A \text{ above } B) > 0.7\}$.
● [Figure: DAG over the background identities sky, mountain, desert, trees, field, valley, river, lake, land, sea, plants, grass, rocks, sand, ground]

Property 3: Boundary Shape
● The characteristics of a region's upper contour correlate with the region's identity.
● [Figure: a sample of contour segments associated with the background object classes mountain, sea, and trees]
● Chunks of upper boundaries, viewed as 1D signals:
● Curves associated with sea, grass, or field resemble DC signals.
● Curves associated with trees and plants are high-frequency signals.
● Curves associated with mountains resemble signals with low frequency and high amplitude.

3D Reasoning: Why are background regions horizontal?
● Land regions whose contour tangents are uniformly distributed in aerial images appear with a strong horizontal bias in images taken by a photographer standing on the ground.
● Flatland: “Place a penny on the middle of one of your tables in Space ... look down upon it. It will appear a circle. ... gradually lower your eyes ... and you will find the penny becoming more and more oval to your view. ...” From Flatland, by Edwin A. Abbott, 1884.
● Θ: the set of tangent angles for contours in aerial images (relative to an arbitrary 2D axis on the surface).
● Θ′: the set of angles that are the projections of the angles in Θ on the camera's image plane.
● [Figure: a schematic illustration of an aerial image with land regions (weed, sand, flora, soil, lake, grass), a contour point p, and its tangent angle θ]
● [Figure: an image is taken by a photographer standing on the ground; camera axes X1, X2, X3, camera height h, and a contour point p at lateral offset x and depth z, projected with tangent angle θ′]

\[ \tan\theta' = \frac{h \tan\theta}{z - x \tan\theta} \]

● [Figure: the distribution of Θ′, assuming Θ ~ U[0°, 180°), h ~ 2 m, z ~ U[0, 1000] m, x ~ U[0, 500] m]

3D Reasoning cont.
Ground elevation and slope statistics
Two landscape image contour types:
1) The contours between different types of regions on the terrain
2) The contours of mountains associated with occluding boundaries (e.g., skylines)

3D Reasoning cont.
Ground elevation and slope statistics
The contours between different types of regions on the terrain
● A point P = (x, H, z) lies on a boundary between land regions, located on an elevated surface with gradient angle ϕ. The plane is rotated by an angle ω relative to the X1 axis.
● [Figure: the point P = (x, H, z) on the tilted plane, with the camera at the origin O and axes X1, X2, X3]

\[ \tan\theta' = \frac{H(\cos\theta \sin\omega + \sin\theta \cos\phi \cos\omega) - z \sin\theta \sin\phi}{x(\cos\theta \sin\omega + \sin\theta \cos\phi \cos\omega) - z(\cos\theta \cos\omega - \sin\theta \cos\phi \sin\omega)} \]

● [Figure: estimated terrain slope distribution using the IIASA-LUC dataset]
● [Figure: the distribution of Θ′, assuming Θ ~ U[0°, 180°), ϕ ~ slope statistics, ω ∈ [−90°, 90°]]. The distribution of H was estimated by sampling an elevation map at pairs of locations up to 9 km apart.

3D Reasoning cont.
Ground elevation and slope statistics
The contours of mountains associated with occluding boundaries
● Tangent angles in images are bounded by the statistics of the maximum slope over land regions.
● [Figure: estimated distribution of the maximum slope over land regions, each covering approx. 9 square kilometers]
* The paper also discusses the effect of land cover and points out other factors that should be considered in a more complete analysis.

The Generative Model
● [Figure: an image sketch with region heights h1, ..., h4 and boundaries S1, S2, S3]
● The segmentation: $S = (h_1, \ldots, h_n, S_1, \ldots, S_{n-1})$
● The label set: $L = \{\text{sky}, \text{ground}, \text{sea}, \text{trees}, \text{grass}, \text{plants}, \text{rocks}, \ldots\}$
● The annotation: $l = (l_1, \ldots, l_n) \in L^n$

\[ P(l \mid S) = \frac{P(S \mid l)\, P(l)}{\sum_{l' \in L^n} P(S \mid l')\, P(l')} \]

● Top-bottom order (the annotation):
\[ P_1(l = (l_1, \ldots, l_n)) = M(\text{'top'}, l_1) \prod_{i=1}^{n-1} M(l_i, l_{i+1})\; M(l_n, \text{'bottom'}) \]
● Region height: $\prod_{i=1,\ldots,n} P_2(h_i \mid l_i)$, a normal distribution for the height covered by each region type.
● Upper boundaries: $\prod_{i=1,\ldots,n-1} P_3(S_i \mid l_{i+1})$, modeled by PCA of “1D” signals.
● [Figure: Markov network over region labels, with nodes such as top, sky, trees, ground, sea, bottom]

The Generative Model: advantages
The generative nature of the model makes it possible to:
1) Generate image sketches, capturing the gist of scenery images (ECCV10)
2) Obtain priors for region annotation (more recent work)

The Generative Model: Training
– Given a set of manually segmented and annotated scenery images:
• Top-bottom order: estimate the transition matrix M by counting the number of occurrences of the different ‘moves’.
• Relative region coverage: estimate the mean and variance of the relative average height of each type of region.
• Upper boundary: for each background region type, collect chunks 64 pixels long. Find the first k principal components and eigenvalues so that 95% of the variation in the training set is modeled.
– It is possible to train different models for different scenery categories. Here we trained a single model on 3 categories together: coast, mountain, open country.
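The top-bottom order step of the training described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `estimate_transitions` and `sample_order`, the toy annotations, and the `max_len` safety guard are all hypothetical and introduced only for this example. It counts ‘moves’ between consecutive labels (with 'top' and 'bottom' as artificial end states), normalizes each row into probabilities, and then draws one label sequence by a random walk on the resulting chain.

```python
import random
from collections import defaultdict

TOP, BOTTOM = "top", "bottom"

def estimate_transitions(sequences):
    """Count top-to-bottom 'moves' and normalize each row into probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    for labels in sequences:
        path = [TOP, *labels, BOTTOM]
        for above, below in zip(path, path[1:]):
            counts[above][below] += 1
    return {
        a: {b: c / sum(row.values()) for b, c in row.items()}
        for a, row in counts.items()
    }

def sample_order(M, rng, max_len=20):
    """Random walk on the chain: start at 'top', stop at the sink 'bottom'.

    max_len is an assumed guard against very long walks, not part of the model.
    """
    state, labels = TOP, []
    while len(labels) < max_len:
        choices = list(M[state])
        weights = [M[state][c] for c in choices]
        nxt = rng.choices(choices, weights=weights)[0]
        if nxt == BOTTOM:
            break
        labels.append(nxt)
        state = nxt
    return labels

# Hypothetical toy annotations: the top-to-bottom label order of four images.
toy = [
    ["sky", "mountain", "sea"],
    ["sky", "mountain", "trees", "field"],
    ["sky", "sea", "sand"],
    ["sky", "trees", "ground"],
]
M = estimate_transitions(toy)
print(M["sky"])                          # estimated 'moves' out of sky
print(sample_order(M, random.Random(0)))  # one sampled top-bottom order
```

The region heights and boundary chunks would be fitted separately (per-label mean and variance, and a per-label PCA); they are left out here to keep the sketch short.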
[Figure: Markov network over region labels, with nodes such as top, sky, trees, ground, sea, bottom]

The Generative Model: Generating Sketches
– Randomly select the top-bottom sequence by a random walk on the Markov network, starting at ‘top’ and stopping at the sink ‘bottom’.
– Randomly select the relative average height of each region.
– Randomly generate the boundaries:
• For each $S_i$, generate 4 chunks $S_{i,1}, \ldots, S_{i,4}$, with $S_{i,m} = \mu + \Phi b$ and $b_j \sim N(0, \lambda_j)$, where $\mu$ is the mean chunk, $\Phi$ holds the first k principal components, and $\lambda_j$ are the corresponding eigenvalues for the region type.
[Figure: example generated sketches with regions labeled sky, mountain, and trees]

The Generative Model: Generated Results
[Figure: typical scenery images (LabelMe), their manual segmentation and region annotation, and semantic sketches of scenery images generated by our model]

The Generative Model: (More) Generated Results

Region Classification
Q: Can the new cues contribute to region classification/annotation?
[Figure: with only layout cues the regions read as sky, mountain, sea? ground?, rocks? plants?; with only color & texture cues they read as sky? sea?, mountain? ground?, sea, rocks; combining both gives sky, mountain, sea, rocks]
A: The new cues are complementary to textural and color cues.
Goal: to show that region classification using global + local descriptors is better than using only local descriptors.

Region Classification - HMM
[Figure: a chain of hidden labels $l_1, \ldots, l_n$; each label $l_i$ emits the observations $H_i$ (height), $T_i$ (texture & color), and $S_i$ (boundary)]
● Marginals are computed by the sum-product message passing algorithm.
● Classification by the maximum marginal: $c_i = \arg\max_{L_j \in L} p_i^j$, where $p_i^j$ is the marginal probability that $l_i = L_j$.

Region Classification - Discussion
General object annotation and detection using context:
G. Csurka and F. Perronnin. An efficient approach to semantic segmentation. IJCV, 2010.
C. Desai, D. Ramanan, and C. Fowlkes. Discriminative models for multi-class object layout. ICCV, 2009.
C. Galleguillos and S. Belongie. Context based object categorization: A critical survey. Comput. Vis. Image Understand., 2010.
X. He, R. S. Zemel, and D. Ray. Learning and incorporating top-down cues in image segmentation. ECCV, 2006.
S. Kumar and M. Hebert. A hierarchical field framework for unified context-based classification. ICCV, 2005.
A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora, and S. Belongie.
Objects in context. ICCV, 2007.
J. Shotton, J. Winn, C. Rother, and A. Criminisi. TextonBoost for image understanding: Multi-class object recognition and segmentation by jointly modeling appearance, shape and context. IJCV, 81(1):2–23, 2009.
● These approaches need approximate inference (e.g., greedy iterative methods, loopy belief propagation).
● Background region classification of scenery images is a 1D problem, which enables exact inference.

Region Classification - Details
● Textural & color features, as in Vogel & Schiele, IJCV 07:
  ● HSV color histograms
  ● Edge direction histograms
  ● Gray-level co-occurrence matrices (GLCM, Haralick et al. 73) with 4 offsets. For each offset: contrast, energy, entropy, homogeneity, inverse difference moment, and correlation.
● $p(l_i = L_j \mid T_i)$ and $p(l_i = L_j \mid S_i)$ are each modeled with a multiclass probabilistic SVM (LibSVM; Wu, Lin, Weng 04) with an RBF kernel.
● 5-fold cross-validation at the image level. Each training run includes parameter selection by cross-validation within the training set.
● Dataset of 1144 images (LabelMe: coast, open country, mountains).
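Because the background regions form a 1D top-to-bottom chain, the label marginals can be computed exactly with sum-product, which on a chain reduces to the forward-backward recursion. The sketch below illustrates that idea only and is not the authors' code: the function name `chain_marginals`, the toy transition weights, and the `evidence` array are hypothetical. In particular, how the SVM outputs for texture, boundary shape, and height are combined into a per-region score is not specified here, so `evidence[i, j]` is simply assumed to be a given nonnegative score of label j for region i.

```python
import numpy as np

def chain_marginals(start, trans, end, evidence):
    """Exact posterior label marginals on a chain via sum-product.

    start[j]       : weight of the move 'top' -> label j
    trans[j, k]    : weight of label j lying directly above label k
    end[j]         : weight of the move label j -> 'bottom'
    evidence[i, j] : local score of region i for label j (assumed given)
    Assumes every normalizing sum below is positive.
    """
    n, k = evidence.shape
    fwd = np.zeros((n, k))
    bwd = np.zeros((n, k))

    # Forward messages, passed from the top region downward.
    fwd[0] = start * evidence[0]
    fwd[0] /= fwd[0].sum()
    for i in range(1, n):
        fwd[i] = (fwd[i - 1] @ trans) * evidence[i]
        fwd[i] /= fwd[i].sum()  # rescale to avoid numerical underflow

    # Backward messages, passed from the bottom region upward.
    bwd[-1] = end
    bwd[-1] /= bwd[-1].sum()
    for i in range(n - 2, -1, -1):
        bwd[i] = trans @ (evidence[i + 1] * bwd[i + 1])
        bwd[i] /= bwd[i].sum()

    post = fwd * bwd
    return post / post.sum(axis=1, keepdims=True)

# Hypothetical toy model with three labels and three stacked regions.
labels = ["sky", "mountain", "sea"]
start = np.array([0.9, 0.1, 0.0])
trans = np.array([[0.0, 0.6, 0.4],
                  [0.0, 0.3, 0.5],
                  [0.0, 0.0, 0.2]])
end = np.array([0.0, 0.2, 0.8])
evidence = np.array([[0.5, 0.3, 0.2],   # weak, ambiguous local cues
                     [0.3, 0.3, 0.4],
                     [0.2, 0.3, 0.5]])

post = chain_marginals(start, trans, end, evidence)
print([labels[j] for j in post.argmax(axis=1)])  # max-marginal classification
```

In this toy example the local scores alone are nearly ambiguous, and the top-bottom order prior is what separates the labels, which is the effect the layout cues are meant to provide.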
Regions per class:
  sky        1120
  mountain   1489
  sea         401
  trees       622
  field       366
  river       150
  sand        182
  ground       94
  grass        36
  land         41
  rocks       201
  plants      143
  snow         50
  plateau      28
  valley       20
  bank         20
  lake          9
  beach         3
  cliff         4
  Total regions: 4979

Region Classification – Results 1
[Figure: for each input image, the ground truth and the labelings obtained with relative location, boundary shape, color&texture, and all cues]

Region Classification – Results 2
[Figure: for each input image, the ground truth and the labelings obtained with relative location, boundary shape, color&texture, and all cues]
Region Classification – Results 3
Accuracy per cue (19 categories):
  Cue                                Accuracy
  Color&Texture                      0.615
  Relative Location                  0.503
  Boundary Shape                     0.452
  Relative Loc. + Boundary Shape     0.573
  Color&Texture + Relative Loc.      0.676
  Color&Texture + Boundary Shape     0.641
  All (ORC)                          0.682

Accuracy per class:
● Color&texture: higher accuracy for trees, field, rocks, plants, snow.
● New cues: better for sky, mountain, sea, sand.
● Other classes: performance is very low because they have very few regions.

Discussion
● We achieved the goal of showing that the new cues improve region classification based only on texture and color.
● Many classifications counted as errors are actually correct.
● This is related to recent work on object categorization with a huge number of categories (Deng, Berg, Li, Fei-Fei ECCV10; Fergus, Weiss, Torralba ECCV10).
● Work in progress.

Summary
● Focus on the characterization of scenery images.
● Intuitive observations regarding the statistics of co-occurrence, relative location, and shape of background regions were explicitly quantified and modeled.
● Some 3D reasoning.
● Non-local properties can capture the gist of images.
● Contextual background region classification with exact inference.
● The new cues improve region classification based on local descriptors.

Future & General Discussion
● A better way to evaluate region classification: work in progress.
● Use the layout cues for better top-down segmentation (Felzenszwalb & Veksler, CVPR 10).
● Shape prior to address the “shrinking bias” (Vicente, Kolmogorov, Rother, CVPR 08).
● Use the layout cues to improve scene categorization.
● Add foreground objects to the model.
● Extend the model to other domains.
● Use the cues to align pictures.
● Generated sketches as a basis for rendering.
● Scenery: too simple? Let's first succeed in understanding these images, following the evolution of the biological visual system.

Thank You For Your Time