Table 1: Some Assumptions of the Visual Indexing Theory (FINST Theory)

(Taken with permission from Pylyshyn's VISUAL INDEXES, PRECONCEPTUAL OBJECTS, AND SITUATED VISION, to appear in a special issue of Cognition on Objects and Attention)

(1) Primitive visual processes segment the visual field into feature-clusters in a preconceptual manner (independent of how they might be encoded, if at all). The ensuing clusters, referred to as primitive visual objects or proto-objects, are ones that tend to be reliably associated with distinct individual objects in the distal scene.

(2) Based on such properties as their local distinctiveness, these clusters are activated and compete for a finite (about 4 or 5) pool of internal pointers called visual indexes (or sometimes FINSTs). These indexes are assigned in a stimulus-driven manner: we say that indexes are "grabbed" by visual objects.

(3) Although assignment of indexes is stimulus-driven, there may be certain restricted ways in which cognition can influence this process. According to our hypothesis, properties of the objects in question are not encoded prior to their being indexed, so a cognitively controlled procedure cannot refer to the properties of the objects - including their locations - in selecting them for assignment. Cognition might nonetheless affect the assignment process indirectly. For example, focal attention might be scanned until it comes across an object, at which time an index may spontaneously get assigned to it. Whether other types of cognitively-driven strategies can also lead to the assignment of indexes is not known.

(4) An index keeps being bound to the same visual object as the object changes its properties and its location on the retina (within certain constraints). In fact, this continued binding is the only thing that makes it count as the "same" visual object.

(5) It is an empirical question what kinds of (proximal and distal) patterns get bound to indexes. Within limits, patterns need not be spatially local and punctate, but they do need to be perceptual wholes, where that term needs to be empirically explicated.

(6) Only indexed objects can enter into subsequent cognitive processing: e.g., relational properties like INSIDE(x,y), ABOVE(x,y), COLLINEAR(x,y,z),... can only be evaluated, and motor commands such as MOVE-ATTENTION(x) can only be executed, if the arguments (x,y,z,...) are bound by indexes.
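
As a reading aid only, assumptions (2), (4) and (6) can be pictured as a small data structure: a fixed pool of pointers that salient clusters grab, that stay attached while the object's properties change, and that predicates require as arguments. The Python sketch below is a hypothetical illustration, not a model proposed in the source. All names (ProtoObject, IndexPool, grab, above), the pool size of 4 (the table says "about 4 or 5"), the use of a single salience number for "local distinctiveness", and the convention that larger y means higher are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

MAX_INDEXES = 4  # assumption: the table only says "about 4 or 5"


@dataclass
class ProtoObject:
    """Hypothetical stand-in for a preconceptual feature cluster."""
    salience: float  # assumed proxy for "local distinctiveness"
    x: float
    y: float         # assumed convention: larger y is higher


class IndexPool:
    """A finite pool of pointers (indexes) bound to proto-objects."""

    def __init__(self, capacity: int = MAX_INDEXES) -> None:
        self.capacity = capacity
        self.bindings: Dict[int, ProtoObject] = {}  # index id -> object

    def grab(self, candidates: List[ProtoObject]) -> None:
        """Stimulus-driven assignment: the most salient clusters take free indexes."""
        ranked = sorted(candidates, key=lambda o: o.salience, reverse=True)
        for obj in ranked:
            if len(self.bindings) >= self.capacity:
                break  # pool exhausted; remaining clusters stay unindexed
            if self.index_of(obj) is not None:
                continue  # already indexed
            free = next(i for i in range(self.capacity) if i not in self.bindings)
            self.bindings[free] = obj

    def index_of(self, obj: ProtoObject) -> Optional[int]:
        """Return the index bound to this very object (identity, not properties)."""
        for i, bound in self.bindings.items():
            if bound is obj:
                return i
        return None


def above(pool: IndexPool, i: int, j: int) -> bool:
    """ABOVE(x, y): evaluable only when both arguments are bound indexes."""
    if i not in pool.bindings or j not in pool.bindings:
        raise LookupError("arguments must be bound by indexes")
    return pool.bindings[i].y > pool.bindings[j].y
```

The binding is held by object identity rather than by stored properties, so changing an object's x or y leaves its index in place, which loosely mirrors assumption (4). The predicate refuses unbound arguments, which loosely mirrors assumption (6).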