Understanding Gestalt Cues and Ecological Statistics using a Database of Human Segmented Images
Charless Fowlkes, David Martin and Jitendra Malik
Department of Computer Sciences, University of California, Berkeley
{fowlkes,dmartin,malik}@eecs.berkeley.edu
Egon Brunswik first suggested nearly 50 years ago that the various Gestalt factors of grouping made sense because they reflected the statistics of natural scenes [1]. For example, if points that are similar in color or luminosity are more likely to belong to the same object, then it is appropriate to group them.
We looked at two measures for quantifying the relative “power” of different segmentation cues.
If we consider the classification task of deciding whether two pixels belong in the same or different
segments, we can compute the Bayes risk associated with the optimal threshold for a given cue.
Unfortunately, risk is uninformative when the Bayes optimal strategy is to declare all points as lying in
different segments.
$$R = \min_{\mathrm{thresh}} \sum_{x} P(\mathrm{error} \mid x, \mathrm{thresh})\, P(x)$$
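As a rough illustration of the risk computation, the following sketch estimates the minimum empirical error of a single-threshold rule from labeled pixel pairs. The function name `bayes_risk`, the rule "declare same segment when the cue value is at or below the threshold", and the sample data are assumptions made for this example; they are not taken from the poster.

```python
def bayes_risk(samples):
    """Estimate the minimum error rate of a single-threshold rule.

    samples: list of (cue_value, same_segment) pairs, where cue_value is a
    dissimilarity (e.g. distance) and same_segment is a bool.
    Assumed rule: declare "same segment" when cue_value <= thresh.
    Returns (risk, thresh) for the best threshold found.
    """
    n = len(samples)
    # Candidate thresholds: every observed value, plus -inf, which
    # corresponds to declaring every pair "different".
    candidates = [float("-inf")] + sorted({x for x, _ in samples})
    best_risk, best_thresh = None, None
    for t in candidates:
        errors = sum((x <= t) != bool(same) for x, same in samples)
        risk = errors / n
        if best_risk is None or risk < best_risk:
            best_risk, best_thresh = risk, t
    return best_risk, best_thresh


# Hypothetical data: small cue values mostly come from same-segment pairs.
informative = [(1, True), (2, True), (3, False), (4, True), (5, False), (6, False)]
print(bayes_risk(informative))

# Hypothetical data where "all different" is optimal: the risk collapses
# to P(same segment) and says nothing about the cue.
uninformative = [(1, False), (2, True), (3, False), (4, False)]
print(bayes_risk(uninformative))
```

The second call shows the degenerate case mentioned above: the best threshold is minus infinity, so the risk simply equals the fraction of same-segment pairs.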
We are building a collection of human-generated segmentations of natural images for
quantifying the nature of cues such as similarity, proximity, and convexity.
Having ground truth allows us to empirically observe the probability that two points in
the image plane should belong to the same segment conditioned on some photometric
property of the image such as the similarity in local intensity.
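A minimal sketch of this kind of empirical estimate, using image-plane distance as the conditioning variable: it counts, for every pair of pixels in a small label map, how often the two pixels share a segment label, binned by distance. The function name, the binning scheme, and the toy label map are assumptions for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def same_segment_prob_by_distance(labels, bin_width=1.0):
    """Estimate P(same segment | distance bin) from one segmentation.

    labels: 2D list of segment ids (one id per pixel).
    Returns {bin_index: probability}, where bin_index = floor(d / bin_width).
    Enumerates all pixel pairs, so it is only practical for small inputs.
    """
    pixels = [(r, c, lab)
              for r, row in enumerate(labels)
              for c, lab in enumerate(row)]
    same = defaultdict(int)
    total = defaultdict(int)
    for (r1, c1, l1), (r2, c2, l2) in combinations(pixels, 2):
        d = ((r1 - r2) ** 2 + (c1 - c2) ** 2) ** 0.5
        b = int(d // bin_width)
        total[b] += 1
        same[b] += (l1 == l2)
    return {b: same[b] / total[b] for b in sorted(total)}


# Hypothetical 1x4 "image" with two segments.
toy = [[0, 0, 1, 1]]
print(same_segment_prob_by_distance(toy))
```

For real images one would sample pixel pairs rather than enumerate them all, and average over many segmentations.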
Two points that are next to each other in the image plane are almost always members of the same segment, but extended regions such as sky mean that the distribution has a heavy tail.
We also compute the mutual information between the same-segment indicator and a given cue. Here we show the risk and mutual information
conditioned on the distance between the two points.
$$I(x; y) = \sum_{x, y} P(x, y) \log \frac{P(x, y)}{P(x)\, P(y)}$$
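The mutual information formula can be evaluated directly from paired discrete samples, as in the sketch below. It assumes the cue has already been quantized into bins and uses a base-2 logarithm; the poster does not state the base, so the units here (bits) are an assumption, as are the function name and the sample data.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Plug-in estimate of I(x; y) in bits from a list of (x, y) samples.

    x is a binned cue value and y is the same-segment indicator; both
    must be hashable. Only observed (x, y) combinations contribute, so
    no zero probability is ever passed to the logarithm.
    """
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# Hypothetical extremes: an independent cue and a perfectly predictive cue.
print(mutual_information([(0, 0), (0, 1), (1, 0), (1, 1)]))
print(mutual_information([(0, 0), (1, 1)]))
```

Unlike the risk, this quantity stays informative even when the best threshold rule is to call every pair "different".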
Cue                   Bayes Risk   Mutual Info.
Proximity             0.335        0.044
Luminance             0.369        0.016
Color                 0.369        0.014
Intervening Contour   0.303        0.081
Texture               0.300        0.112
We operationalize convexity as the ratio between a
segment’s area and the area of its convex hull.
Convex regions are far more prevalent since images
tend to consist of several convex foreground objects
in front of a background segment.
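One way to compute such an area ratio for a pixel segment is sketched below. It takes the hull of the pixel squares' corners (so that a filled rectangle scores exactly 1), builds the hull with Andrew's monotone chain algorithm, and measures its area with the shoelace formula. These choices and all function names are assumptions for the example, not a description of the procedure used for the poster.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(poly):
    """Shoelace formula for a simple polygon given as a vertex list."""
    twice_area = 0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


def convexity(pixels):
    """Segment area divided by convex-hull area, for a list of (row, col) pixels."""
    unique = set(pixels)
    # Each pixel is a unit square; use its four corners so the hull
    # encloses the full pixel area.
    corners = [(r + dr, c + dc) for r, c in unique for dr in (0, 1) for dc in (0, 1)]
    return len(unique) / polygon_area(convex_hull(corners))


# Hypothetical segments: a 2x2 square and a 3-pixel "L" shape.
print(convexity([(0, 0), (0, 1), (1, 0), (1, 1)]))
print(convexity([(0, 0), (1, 0), (1, 1)]))
```

The square yields a ratio of 1, while the L shape falls below 1 because its hull covers area outside the segment.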
Since natural images contain
texture and shading, identical
luminance does not imply
membership in the same segment.
Probability of lying in the
same segment conditioned
on both image plane and
color space distance.
YX

 color  0.926
 gray  0.913
Due to the self-similar nature of images, we expect
the distribution of region sizes to follow a power
law over some range of scales. Our result here
agrees with previous work in the area.
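A power-law relationship appears as a straight line on log-log axes, so a simple way to check for one is an ordinary least-squares fit of log frequency against log region size. The sketch below does exactly that; it is a generic illustration with a hypothetical function name and synthetic data, and the poster does not say which fitting procedure was used.

```python
from math import log


def fit_power_law_exponent(sizes, counts):
    """Least-squares slope of log(count) versus log(size), negated.

    If counts are proportional to size ** (-alpha), this returns alpha.
    sizes and counts must be positive and of equal length (at least two
    distinct sizes).
    """
    lx = [log(s) for s in sizes]
    ly = [log(c) for c in counts]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    return -sxy / sxx


# Synthetic histogram that halves each time the region size doubles,
# i.e. counts proportional to size ** -1.
print(fit_power_law_exponent([1, 2, 4, 8], [100, 50, 25, 12.5]))
```

On measured histograms the fit should be restricted to the range of scales over which the log-log plot is actually linear.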
Segmentations of color images tend to contain
more small segments.
[1] E. Brunswik, J. Kamiya, “Ecological validity of proximity and other Gestalt factors,” American Journal of Psychology, pp. 20-32, 1953.
[2] D. Mumford, B. Gidas, “Stochastic Models for Generic Images,” Technical Report, Division of Applied Mathematics, Brown University, 1998.
[3] D. L. Ruderman, “The Statistics of Natural Images,” Network, 5(4):517-548, 1994.
[4] L. Alvarez, Y. Gousseau, J. Morel, “Scales in Natural Images and a Consequence on Their BV Norm,” Scale-Space Theories in Computer Vision, 1999.
[5] W. Geisler, J. Perry, B. Super, D. Gallogly, “Edge Co-occurrence in Natural Images Predicts Contour Grouping Performance,” Vision Research, 41, 711-724, 2001.
[6] J. August, S. Zucker, “The Curve Indicator Random Field: Curve Organization via Edge Correlation,” in Perceptual Organization in Artificial Vision Systems, Boyer and Sarkar (eds.), Kluwer, pp. 265-287, 2000.
[7] S. C. Zhu, “Embedding Gestalt Laws in Markov Random Fields,” IEEE Trans. on Pattern Analysis and Machine Intelligence, 21(11), Nov. 1999.