Understanding Gestalt Cues and Ecological Statistics using a Database of Human Segmented Images

Charless Fowlkes, David Martin and Jitendra Malik
Department of Computer Sciences, University of California, Berkeley
{fowlkes,dmartin,malik}@eecs.berkeley.edu

Egon Brunswik first suggested nearly 50 years ago that the various Gestalt factors of grouping made sense because they reflected the statistics of natural scenes [1]. For example, if points that are similar in color or luminosity are more likely to belong to the same object, then it is appropriate to group them.

We looked at two measures for quantifying the relative "power" of different segmentation cues. If we consider the classification task of deciding whether two pixels belong in the same or different segments, we can compute the Bayes risk associated with the optimal threshold for a given cue x:

\[ R = \min_{\mathrm{thresh}} \sum_{x} P(\mathrm{error} \mid x, \mathrm{thresh}) \, P(x) \]

Unfortunately, risk is uninformative when the Bayes optimal strategy is to declare all points as lying in different segments.

We are building a collection of human-generated segmentations of natural images that is useful in quantifying the nature of cues such as similarity, proximity, and convexity. Having ground truth allows us to empirically observe the probability that two points in the image plane should belong to the same segment, conditioned on some photometric property of the image such as the similarity in local intensity.

Two points that are next to each other in the image plane are almost always members of the same segment, but extended regions such as sky mean that the distribution has a heavy tail.

We also compute the mutual information between the same-segment indicator and a given cue:

\[ I(x; y) = \sum_{x, y} P(x, y) \log \frac{P(x, y)}{P(x) \, P(y)} \]

Here we show the risk and mutual information conditioned on distance between the two points.

Cue                   Bayes Risk   Mutual Info.
Proximity             0.335        0.044
Luminance             0.369        0.016
Color                 0.369        0.014
Intervening Contour   0.303        0.081
Texture               0.300        0.112

We operationalize convexity as the ratio between a segment's area and the area of its convex hull. Convex regions are far more prevalent, since images tend to consist of several convex foreground objects in front of a background segment.

Since natural images contain texture and shading, identical luminance does not imply segment membership.

Probability of lying in the same segment conditioned on both image plane and color space distance.

[Plot legend: color 0.926; gray 0.913]

Due to the self-similar nature of images, we expect the distribution of region sizes to follow a power law over some range of scales. Our result here agrees with previous work in the area. Segmentations of color images tend to contain more small segments.

[1] E. Brunswik, J. Kamiya, "Ecological validity of proximity and other Gestalt factors," American Journal of Psychology, pp. 20-32, 1953.
[2] D. Mumford, B. Gidas, "Stochastic Models for Generic Images," Technical Report, Division of Applied Mathematics, Brown University, 1998.
[3] D. L. Ruderman, "The Statistics of Natural Images," Network, 5(4):517-548, 1994.
[4] L. Alvarez, Y. Gousseau, J. Morel, "Scales in Natural Images and a Consequence on Their BV Norm," Scale-Space Theories in Computer Vision, 1999.
[5] W. Geisler, J. Perry, B. Super, D. Gallogly, "Edge Co-occurrence in Natural Images Predicts Contour Grouping Performance," Vision Research, 41, 711-724, 2001.
[6] J. August, S. Zucker, "The Curve Indicator Random Field: Curve Organization via Edge Correlation," pp. 265-287, in Perceptual Organization in Artificial Vision Systems, Boyer and Sarkar (eds.), Kluwer, 2000.
[7] S. C. Zhu, "Embedding Gestalt Laws in Markov Random Fields," IEEE Trans. on Pattern Analysis and Machine Intelligence, 21(11), Nov 1999.
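Supplementary sketch: the two cue-power measures described above (Bayes risk at the optimal threshold, and mutual information between the same-segment indicator and a cue) can be estimated from sampled pixel pairs roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the cue has been reduced to a discrete dissimilarity value per pixel pair, that the decision rule is "same segment if and only if cue <= thresh", and that information is measured in bits; the function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def bayes_risk(cues, same):
    """Minimum empirical error of the rule 'same segment iff cue <= thresh'."""
    n = len(cues)
    pairs = sorted(zip(cues, same))
    # Threshold below every sample: all pairs are declared "different",
    # so the errors are exactly the pairs that truly share a segment.
    errors = sum(1 for _, s in pairs if s)
    best = errors
    i = 0
    while i < n:
        # Raise the threshold past every sample sharing this cue value.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            # The pair is now declared "same": a true-same pair stops
            # being an error, a true-different pair becomes one.
            errors += -1 if pairs[j][1] else 1
            j += 1
        best = min(best, errors)
        i = j
    return best / n


def mutual_information(cues, same):
    """Empirical I(cue; same-segment indicator) in bits (log base 2 assumed)."""
    n = len(cues)
    joint = Counter(zip(cues, same))
    p_cue = Counter(cues)
    p_same = Counter(same)
    mi = 0.0
    for (x, s), count in joint.items():
        # P(x, s) / (P(x) P(s)) written in terms of raw counts.
        ratio = count * n / (p_cue[x] * p_same[s])
        mi += (count / n) * math.log2(ratio)
    return mi


# Hypothetical toy data: one cue value and one ground-truth label per pixel pair.
cues = [1, 1, 2, 2, 3, 3, 4, 4]
same = [True, True, True, False, False, True, False, False]

risk = bayes_risk(cues, same)
mi = mutual_information(cues, same)
print(risk, mi)
```

Because the threshold sweep starts from "declare everything different", a cue that carries no usable information simply returns the prior probability of the rarer class as its risk, which is the uninformative case noted in the text; mutual information does not have that blind spot.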
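Supplementary sketch: the convexity measure described above (segment area divided by the area of its convex hull) can be illustrated as follows. This is only a sketch under stated assumptions: the segment is represented by its boundary polygon rather than by a pixel mask, the hull is computed with Andrew's monotone chain, and all function names are hypothetical rather than taken from the authors' code.

```python
def polygon_area(pts):
    """Shoelace formula; pts is a list of (x, y) vertices in boundary order."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b makes a counter-clockwise turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def convexity(boundary):
    """Ratio of segment area to convex hull area; 1.0 for a convex segment."""
    return polygon_area(boundary) / polygon_area(convex_hull(boundary))


# Hypothetical segments: a unit square (convex) and an L-shaped region (not convex).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]

square_ratio = convexity(square)
l_ratio = convexity(l_shape)
print(square_ratio, l_ratio)
```

The ratio lies in (0, 1] for any simple polygon with nonzero area, so a histogram of this value over all human-marked segments is one way to examine how prevalent convex regions are.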