Zenon Pylyshyn (Center for Cognitive Science and Department of Psychology)
Applying research on spatial organization to problems of cooperative work, telelearning and communication of knowledge.
Background: Psychological use of physical and mental space
A major cognitive framework for individuating, visualizing, and keeping track of different items of knowledge (such as who said what in a conference or what items of data go with what) is the use of real 3D spatial locations. We use space both literally (as in the desktop or office model of data organization) and figuratively. Examples of the latter include mentally locating different facts and premises in imagined spatial loci, a technique widely used in mnemonic aids, and the use of spatial location in reasoning, where so-called "spatial paralogic" provides an important scheme for keeping track of the different components of a problem. The use of distinct spatial loci in reasoning and visualizing can be enhanced, and its effectiveness in communication increased, if distinct spatial locations can be shared. This, of course, happens routinely when people gesture, point, and carve shapes in the air as they converse. Sharing a common workspace and conceptual space is now becoming technologically feasible through interactive multimedia workstations, in which sound and 3D visual locations can be communicated and spatial indicators such as pointing in space can form part of the human-computer interaction.
There is already considerable psychophysical evidence concerning the interaction of sound and vision in localization and identification, as well as the interaction of these modalities with motor movements. However, this evidence has not been brought to bear on the design of multimedia (or telepresence) workstations because such interactions between users and computers have not, until now, been technically feasible.
Although in many places high bandwidth links make it possible to use high-immersion displays, the introduction of space-based communication-enhancing information will have to be incremental and deal initially with what can be done with current low bandwidth multimodal communications networks to enhance communication in education, training, and cooperative work. For example, partly because of bandwidth limitations and the delays inherent in packet-switching technology, current teleconferencing and cooperative work systems designed to operate over digital networks (such as those supplied commercially) are extremely primitive and provide only very impoverished information about location of speakers and their gestures. This sort of information, however, could usefully be augmented by recognizing and highlighting which person is speaking (using sound-localization and speaker identification techniques already available) and even by automatically zooming in and providing additional bandwidth in the region of the speaker.
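The idea of providing additional bandwidth in the region of the speaker can be illustrated with a minimal sketch. It assumes that some speaker-identification step has already named the active speaker and that the channel has a fixed bit budget; the function name `allocate_bandwidth`, the `speaker_share` parameter, and the even split among the remaining participants are all hypothetical choices made for illustration, not part of any existing system.

```python
def allocate_bandwidth(total_kbps, participants, speaker, speaker_share=0.5):
    """Split a fixed bit budget, reserving speaker_share for the active speaker.

    All names and the even split among non-speakers are illustrative assumptions.
    """
    if speaker not in participants:
        raise ValueError("speaker must be one of the participants")
    others = [p for p in participants if p != speaker]
    if not others:
        # A lone participant simply receives the whole channel.
        return {speaker: float(total_kbps)}
    speaker_kbps = total_kbps * speaker_share
    # The remaining budget is shared evenly among the non-speaking participants.
    per_other = (total_kbps - speaker_kbps) / len(others)
    allocation = {p: per_other for p in others}
    allocation[speaker] = speaker_kbps
    return allocation
```

Calling the function again whenever the identified speaker changes would move the larger share to the new speaker's image region while the total stays within the channel budget.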
Initially the proposed research will deal with such questions as what spatial information (e.g. gestures, location of speakers, location of objects referred to, etc.) is most important for communication and how best to encode such information for channels that will at first be low-bandwidth. The recognition of gestures and their encoding and transmission represents a longer-term research program. However, the work that has already been carried out in the design of certain computerized choreography tools, such as the LifeForms system at Simon Fraser University, is an excellent step towards this goal, since it provides both a compact model of the human form (for use in automatic recognition of body orientation and for parameterizing this information for compact transmission) and a compact representation of human movements and gestures. Used in conjunction with motion-capture techniques, it could provide compact transmission of both gestures and directional information within a shared 3D space.
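To make the notion of parameterizing body information for compact transmission concrete, the following sketch quantizes a handful of joint angles to one byte each and packs them with a speaker identifier. The joint list, the one-byte resolution, and the packet layout are assumptions made purely for illustration; the text does not describe the LifeForms body model, and this is not a representation of it.

```python
import struct

# Hypothetical joint set, chosen only for illustration.
JOINTS = ("head", "torso", "l_shoulder", "l_elbow", "r_shoulder", "r_elbow")
PACKET_FORMAT = f"B{len(JOINTS)}B"  # one byte of speaker id, one byte per joint

def quantize(angle_deg):
    """Map an angle in degrees to one byte (256 steps of about 1.4 degrees)."""
    return round((angle_deg % 360.0) / 360.0 * 256) % 256

def dequantize(code):
    """Recover the approximate angle in degrees from a one-byte code."""
    return code * 360.0 / 256

def encode_pose(speaker_id, pose):
    """Pack a speaker id and one quantized angle per joint into a short packet."""
    codes = [quantize(pose[j]) for j in JOINTS]
    return struct.pack(PACKET_FORMAT, speaker_id, *codes)

def decode_pose(packet):
    """Unpack a packet into the speaker id and approximate joint angles."""
    speaker_id, *codes = struct.unpack(PACKET_FORMAT, packet)
    return speaker_id, {j: dequantize(c) for j, c in zip(JOINTS, codes)}
```

Under these assumptions a pose occupies 7 bytes, so sending 15 poses per second would cost 7 x 15 x 8 = 840 bits per second per participant, at the price of a quantization error of at most about 0.7 degrees per joint.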
Benefits of this line of research
In addition to providing design principles for workstations and cooperative work/learning environments, this line of exploration feeds into the following application areas:
1) Communication and learning aids for the disabled
We know that the use of space and spatial metaphors in communication is universal (Lakoff & Johnson). It is also well known that spatial location plays a major role in communication for the deaf, as is shown by the way sign languages deal with pronouns and anaphora. It is also a major framework by which blind people not only navigate but also remember and model their environment. Although congenitally blind persons can't be said to have "visual images" in the usual sense, they do have a very well developed representation of space that serves a very similar function. Indeed, it is well documented (Golden-Meadow & Gleitman) that children blind at birth use spatial terms exactly the way sighted children do, though without ever having experienced the referents visually.
2) Communication, training, education and work in distant places
The ability to show and manipulate objects, to use communicative gestures as well as words, to refer and point to places in a room where ideas and blackboards are located, and in general to utilize the spatial framework of a common room, is important to intimate working relations. But it is even more important when a common language and linguistic culture do not naturally exist. It is also important for establishing the sense of personal interaction that is eroded by long-distance unimodal communication. Studies of such communication media as email, and comparisons of voice and visual contact, show that simply increasing bandwidth is not always the answer to greater apparent "intimacy" and that different media are best suited to different kinds of interactions. However, for the kind of learning experiences that children, say, respond best to, the most intimate and close connection between teacher and pupil is essential.
3) Closely related to telelearning is cooperative work-at-a-distance
Although teleconferencing has been a moderately active subject of research, the logical extension of this idea to task-oriented workshops that use multimedia and shared workspaces is in its infancy. Cooperative interactive work is becoming more and more important, especially where the workforce is widely dispersed. The more complex the combination of skills required for some piece of work, such as designing some artifact, the more important the technology of multiple-authoring and multiple-person designing becomes. Design of such artifacts as novel computers requires the cooperative effort of many fields of expertise, as does the production of a multi-authored document. When people are present in the same room and work on a design together they not only talk vigorously, but they sketch (perhaps using a computer console and electronic pointing device), walk about the room, jot down notes on different boards, point to previous sketches, and may even physically examine a partially completed or roughly approximated three-dimensional object. None of these activities is available in tele-operation mode. However, some of these interactions can in principle now be realized in high-immersion interactive workstations with a virtual shared environment. It is possible to provide several screens situated around a room, and to encode and transmit gestures and sounds so as to preserve their 3D locations and directions. The use of spatial location in this way would not only enhance communication, but would also serve to distinguish and index different types of information. Participants could refer to different aspects of their discussion the way blind people sometimes do: by associating different ideas or aspects of the joint work with places in a common workspace.