Perceptual grouping and perceptual organization

How does the visual system organize an image into "objects"?

Objects. The visual image starts as an inchoate jumble of visual data, but the phenomenal visual world consists of coherent units, things, and "objects." How does the visual system figure out what the units or objects are? What are visual objects, exactly? One answer is to think of "objects" in terms of the way the visual image is organized after it has been parsed by the visual system. That organization is hierarchical, like a tree, with more global relations near the root and more local relations further down towards the leaves (note: upside-down tree!). In several papers (here and, more briefly, here) we have shown how the tree can be decomposed into subtrees, some of which are, in a mathematical sense, maximally internally coherent while being maximally disjoint from the rest of the tree. These subtrees are the objects.

Computational approaches to Gestalt grouping. Perceptual grouping is an inherently difficult problem from a computational point of view, in part because it is hard to define exactly what a "good" group should consist of. The Gestalt psychologists famously offered many principles that subjective groups tend to obey, such as proximity, similarity, good continuation (see "contour integration" below), and the ultimate über-principle Prägnanz, meaning roughly "goodness of form." What, if any, is the underlying motivation common to all these principles? In several papers (here and here), we have developed the idea that perceptual grouping can be thought of as a kind of "inference to the best explanation," in which the system adopts the model of the image that is, in a certain sense, formally minimal or maximally "generic." This model, called the Minimal Model, is essentially the simplest or most regular interpretation consistent with the image data. The minimal model of a dot configuration corresponds fairly closely with the way one subjectively organizes the configuration.
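As a rough illustration of "the most regular interpretation consistent with the image data," the toy sketch below classifies three dots by trying candidate interpretations from most to least constrained and keeping the first one the dots satisfy. The candidate list (coincident, collinear, generic), the tolerance `tol`, and the function names are all assumptions made for this example; this is not the formal machinery of the papers.

```python
import math

def all_coincident(p, q, r, tol):
    """True if the three dots fall on (nearly) the same location."""
    return max(math.dist(p, q), math.dist(q, r), math.dist(p, r)) <= tol

def collinear(p, q, r, tol):
    """True if the three dots lie (nearly) on one straight line."""
    longest = max(math.dist(p, q), math.dist(q, r), math.dist(p, r))
    if longest == 0:
        return True  # degenerate case: all dots coincide exactly
    # Twice the triangle area, from the 2D cross product.
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    # Height of the triangle over its longest side.
    return abs(cross) / longest <= tol

def minimal_model(p, q, r, tol=0.01):
    """Return the most constrained interpretation the dots obey.

    Candidates are ordered from most regular to least regular
    (a hypothetical ordering chosen for this toy example).
    """
    candidates = [
        ("coincident", all_coincident),
        ("collinear", collinear),
    ]
    for name, holds in candidates:
        if holds(p, q, r, tol):
            return name
    return "generic"  # no special regularity detected

print(minimal_model((0, 0), (1, 0), (2, 0.001)))  # nearly straight line
print(minimal_model((0, 0), (1, 0), (0, 1)))      # right-angle triangle
```

Ordering the candidates by how constrained they are is what makes the first match the "minimal" one in this sketch: a configuration that satisfies a tight regularity is never explained by a looser one.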

Contour integration. One of the central problems in perceptual grouping is contour integration, in which isolated visual elements are grouped together to form long, smooth chains or contours. This process was attributed by the Gestaltists to a preference for "good continuation," but this is simply a label for an organizing tendency that is difficult to either quantify or motivate. We have developed the idea that one can think of contours in the world as curvilinear stochastic processes that generate visual elements under a probabilistic generative model; in essence, curves generate dots under a particular probability distribution. This generative model is, in effect, the brain's model of a "smooth curve." Then, using Bayes' rule, we can do inference and decide whether a particular configuration of visual elements was likely to have been generated by a common, coherent contour in the world. This paper proposes the basic idea of the stochastic curve model; this paper investigates how human observers combine probabilistic inferences along a chain of dots; and this paper puts it all together into a Bayesian theory of contour integration.
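The inference described above can be sketched in a few lines. The example below assumes, purely for illustration, that a smooth contour produces turning angles between successive dots drawn from a zero-centered Gaussian with spread `sigma`, while unrelated dots produce turning angles that are uniform on the circle. The Gaussian form, the value of `sigma`, the equal prior, and the function names are assumptions of this sketch, not parameters taken from the papers.

```python
import math

def turning_angles(points):
    """Signed turning angle (radians) at each interior dot of a chain."""
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        heading_in = math.atan2(y1 - y0, x1 - x0)
        heading_out = math.atan2(y2 - y1, x2 - x1)
        # Wrap the heading change into [-pi, pi).
        turn = (heading_out - heading_in + math.pi) % (2 * math.pi) - math.pi
        angles.append(turn)
    return angles

def log_lik_contour(angles, sigma):
    """Log-likelihood under the assumed smooth-curve model.

    Gaussian on the turning angle; for small sigma the mass lost
    outside [-pi, pi) is negligible, so no renormalization is done.
    """
    norm = math.log(sigma * math.sqrt(2 * math.pi))
    return sum(-0.5 * (a / sigma) ** 2 - norm for a in angles)

def log_lik_unrelated(angles):
    """Log-likelihood if the dots are unrelated: uniform turning angles."""
    return -len(angles) * math.log(2 * math.pi)

def posterior_contour(points, sigma=0.35, prior=0.5):
    """Posterior probability, via Bayes' rule, that one contour made the dots."""
    angles = turning_angles(points)
    log_odds = (log_lik_contour(angles, sigma) + math.log(prior)
                - log_lik_unrelated(angles) - math.log(1 - prior))
    # Numerically stable logistic function of the log posterior odds.
    if log_odds >= 0:
        return 1 / (1 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1 + e)

straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
zigzag = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(posterior_contour(straight))  # close to 1: likely a single contour
print(posterior_contour(zigzag))    # close to 0: sharp turns are unlikely
```

Because the log-likelihoods add across successive turning angles, evidence accumulates along the chain: each additional nearly collinear dot raises the posterior, and each sharp turn lowers it.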

Links:

A short paper on visual objects, and a longer one

A shorter paper on Minimal Model theory, and a longer one

Papers developing a Bayesian approach to contour integration, covering the 3-element case, the 4- and 5-element cases, and the general case.

A paper discussing how Minimal Model theory relates to Bayesian theory