Seeing: It's Not What You Think
Lecturer: Prof. Zenon Pylyshyn
Seminar in Cog Sci I: 16:185:600:01, Index #28759S
Meets on Thursdays, 9:50am to 12:30pm, in the RuCCS Playroom, A139.
Cross-listed as Psychology 16:830:637
This course will deal with some common misconceptions about the nature of visual perception and how it relates to cognition and to mental imagery. It will also deal with how visual representations connect with the world; how they get “grounded”. The title is deliberately ambiguous: It refers to the misconceptions many people have about what vision does (misconceptions that arise from the phenomenology of seeing), and it also refers to the thesis that vision is distinct from cognition (the modular view of visual perception). A large part of the class this year will consider the question of whether the visual system is involved in mental imagery and if so, what this might tell us about the nature of mental images.
The seminar will cover four main topics:
- The (mistaken) view that vision is a process that at some stage constructs an internal image, which includes pictorial details.
- The (mistaken) view that what we see is largely a function of what we know and expect, so that vision is continuous with cognition.
- A discussion of how visual apprehension and objects in the world connect, which will get us into a discussion of the nature of object-based attention and of Visual Index Theory and the so-called "situated" view of vision.
- A detailed critical analysis of the proposal that reasoning using mental images involves the visual system and therefore that a mental image must be something that we "see".
The readings for this seminar consist of a draft of my book-in-progress (with the same title as the course) and some additional readings, including an in-press BBS article on mental imagery. These will be distributed in class and will also be available on the Web.
Spring and Fall Fashions in Cognitive Science!
Rutgers Center For Cognitive Science
This is indeed an auspicious time for Cognitive Science. I stand here before you this evening as the first Chair to give a presidential address to this austere body, to place on record before you what you are to accept as the Society's official view on the new science of the mind. This is a particularly important period in the history of the Cognitive Science Society. It is the eighth anniversary of the founding of this society. Many of you may know that the society was founded eight years ago in a historic hotel room at the Kansas City Airport, which I believe has now been converted into a national historical monument. It was there that we founded this society and announced to the world the dawning of a new science, a science that would soon unravel the mysteries of the mind and reveal the secrets of human nature. A science that might one day even be turned upon itself to answer the question: Why did 12 reasonably intelligent people, backed by the boundless resources of the Alfred P. Sloan Foundation, choose to travel thousands of miles to meet at the Kansas City Airport?
This eighth anniversary is a particularly important one. Although other societies frequently pay more attention to the tenth, twentieth, and so on anniversaries, we -- of all people -- understand that this is a chauvinistic view that arises from the mere contingent fact that humans happen to have ten fingers -- more or less. Dedicated as we are to transcending the merely anthropocentric viewpoint in our search for Cognitive universals, natural constraints and timeless invariants, we are well aware that in an absolute notation -- such as one based on the prime factor decomposition -- eight is a much more important number, since it decomposes into three multiples of the lowest prime known to Cognitive Science. This, then, is the day we celebrate our first triple-prime anniversary. At that first meeting in -- as I said before -- the Kansas City Airport, an attempt was made to represent all potentially relevant constituencies. Like Noah's ark we packed the airport hotel room with pairs of linguists, psychologists, computer scientists, philosophers, and even a lone anthropologist (I leave it to you to speculate on the deeper implications of there being only one anthropologist in this ark). This ecumenism, I believe, was one of our first fateful mistakes, since it started the field off in a direction of permissiveness and libertarianism from which it may never recover. One consequence is that an unusually large proportion of the membership of the society is dedicated to promoting the view that the field which the society represents does not exist. As I find myself very often rubbing shoulders at conferences with many of these detractors, I sometimes feel like one of the ministers in this New Yorker cartoon.
Now, on its eighth anniversary, the Cognitive Science Society -- if not the field -- has matured to the point where it can offer its chairman a serious recompense for using his name on its letterhead. The reward is that the chairman is offered a captive and appreciative audience which is committed to sit still for an hour or so while the chairman holds forth on matters that would normally be considered too ill mannered to present in public. Needless to say, this talk has not been refereed, although I have checked my facts over several times for sincerity.
Until a few years ago the way to earn your credentials in cognitive science -- the way to publish in one of the mainstream cognition journals -- was simply to run a reaction time study that yielded a straight line -- any straight line would do, it didn't matter what the independent variable was. Now I must confess that I have not been notably successful in meeting this challenge, although I have generated a great deal of very reliable data -- a fragment of which I will now show you to prove that I am no mere armchair cognitive scientist.
As you can see this slide shows that I was able to measure reaction time for a group of subjects while manipulating various independent variables (it doesn't matter which, the results remain pretty much the same). I know that it might seem to some of you who are not experienced in such matters that these data are not clean and precise. However, much depends on how you look at it. For example, in the next slide I show you that if you look closely at the data, they turn out to be extremely clean and precise.
Of course it's only when you do serious statistical analyses of the raw data on a powerful computer that you really see the significance of this work. We have over the years replicated these results in our laboratory over and over again, and we have been able to check them against the unpublished results of other investigators, so that now we are morally certain of the conclusion to which they lead us. And that conclusion is this: for normal populations of college undergraduates, when you carefully control for experimental artifacts the null hypothesis is always empirically true.
In case you are one of those people who managed to get a straight line reaction-time result, let me assure you that I am not questioning your honesty. There have literally been thousands of reaction time experiments with linear outcomes. For example, there have been hundreds of variations on the Sternberg rapid memory scan experiment, hundreds more on mentally rotating images, mentally comparing magnitudes, and scanning mental images, as well as visually scanning displays for various features. All have been done with the utmost integrity. Indeed, notwithstanding the results I showed you earlier, I have to admit that once even I obtained a reliable linear reaction-time function that I will tell you about. This result illustrates what I suspect is behind many of the linear results obtained in the literature -- it's certainly what's behind all the mental image scanning results.
What I did was this. Following some work carried out in France about thirty years ago by Fraisse, I asked subjects to generate a reaction time proportional to various magnitudes I showed them or told them about -- such as length of lines, size of numbers, and so on. I found that subjects could do this very well. They readily generated reaction-times that were a linear function of some magnitude or other that they had in mind. This demonstrates clearly that subjects can produce beautifully linear reaction time functions if you just explain to them clearly that this is what you want and if you ask them nicely and politely to do so. And conversely, as many of you who have heard me talk about certain mental imagery experiments know, we can also make many linear reaction-time functions go away by using this same clever procedure of persuading subjects that there should be no linear relation. It seems that in their zeal to be laboratory experimentalists, psychologists have forgotten that, within fairly wide margins, subjects can do whatever they like. This, by the way, is not unrelated to an even deeper message about cognition I shall deliver later in this talk.
Incidentally, psychologists are not the only cognitive scientists who fall victim to the cleverness of their subjects. Consider the plight of the poor anthropologists:
I don't want to leave you with the impression that finding linear reaction-time functions has been the only way of discovering truths about the mind. To illustrate the wide range of methods that have been developed for investigating cognition, I will mention a few randomly chosen examples of some classical methods that have been popular in certain quarters of cognitive science. For instance, there was the form of investigation shown in the next slide, which begins with a set of sentences, some of which have asterisks in front of them, and which ends with the conclusion that language is innate. One might call this method Chomsking to conclusions.
As another example, in the area of visual perception the old method of becoming famous by discovering an illusion still continues. But real fame and fortune is gained by those who produce a picture that people can use in order to argue their favorite theory. For example, take the old top-down versus bottom-up argument. This cute teddy bear [Marr's teddy bear] has inspired many people to view vision as being a bottom-up process which begins with the detection of zero-crossings in the second derivative of the intensity function, whereas this equally attractive Dalmatian is held out by others as showing that vision must be top-down. The debate over the direction of perceptual processing is another of those arguments that goes back and forth in a cyclical manner. No sooner does one side demonstrate a strong top-down effect (in which the conceptual "big picture" determines what is seen, as metaphorically shown in this slide [top-down]) than the other group shows that this process can be undermined by bottom-up data-driven processes, as metaphorically shown in this next slide [bottom-up].
Yet another example comes from the study of problem solving, an area that has made great strides in the past 15 years. The growth of this area is due in part to the work in Artificial Intelligence, which has provided much of the theoretical framework. But in part it is also due to the invention of a devilishly clever methodology for discovering how a subject solves a problem. This technique consists of asking the subject. It is best if you do it while the subject is solving the problem, because then he tends to make a lot of mistakes which require a cognitive scientist to explain (that's because, you see, when the subject gets it right this is the province of AI). This method is called protocol analysis and is very important now that the business world has discovered "expert systems", which are computer programs that make some of the same mistakes as people, only faster. This obviously requires a careful analysis of the expert's mistakes.
Over the past decade Cognitive Science has made considerable progress not only in methodology, but also in its general approach to research and in the way it selects worthwhile research projects. We have all come a long way from the old days when we spent our time searching for straight-line reaction time functions or trying to demonstrate that perception was top-down or bottom-up. Today we are beginning to realize the deeper significance and far-reaching implications of our research. We see that research such as that which I discussed earlier can be used not only to argue for nativism or modularity, but also to argue against American foreign policy or against the Unix operating system. We have gone beyond the laboratory and are reaching out towards some of that loose AI venture capital by recognizing that our research is contributing to an understanding of the "user interface" and what makes it "user-friendly", or to improving the lot of "knowledge-engineers" by helping them enhance their "knowledge-acquisition" procedures. No longer is the ambition of new graduate students limited to the hope that they will successfully complete 3 experiments and a 5 page LISP program -- that well-known formula for getting published in Cognitive Psychology. Today they are out there interviewing experts in every walk of life to discover their "knowledge-structures", or are trying to teach 3 year olds to program in LISP, because they know that at the end of that road lies user-friendly private enterprise: Not only Teknowledge, but Tekmoney!
But I'm not being fair. Of course, not everyone is tempted by such pecuniary motives. Many people just want to be invited to conferences in the south of France. Others have purely intellectual motives. For example, some would like to show that cognition can be studied in the same esoteric macho manner as solid state physics. That's why we are seeing such a flurry of exotic mathematics recently. Math is back in as a symbol of virility -- as I believe Gary Larson was first to recognize.
Technical tools have always been important in the development of science, and I do not wish to minimize their role. But sometimes the application is premature: either the tools are not ready, or the field is not ready to use them, as was the case, I understand, with the early microscope, as shown in this slide:
It is also true in cognitive science, as it has been in all other areas of human endeavour, that a very important reason for doing brilliant work is to show up the great people who have gone before you -- the Chomskys or Minskys or Newells and Simons. Of course the real big-fish target these days is John von Neumann. Sociologically-speaking, the great break-through of the past decade has been the ganging up on von Neumann -- or at least on the von Neumann computer. Nobody knows exactly what makes a computer von Neumannish, but there is general agreement that whatever it is, it is a bad thing. What is a good thing nowadays is to be parallel. Not just plain ordinary multi-processor parallel, but massively parallel. Once again nobody quite knows what that is, and whether it is to be contrasted with, say, being massively serial, but we do know that it is a good thing. Just why it is generally thought to be a good thing is a topic to which I will return shortly. But first I need to back away from such specifics in order to exercise my chairman's prerogative to be profound. I begin, therefore, with some background profundities.
In its formative years a discipline is very sensitive to the implicit foundational and philosophical assumptions of its practitioners. Moreover, it is also very susceptible to the whims of fashion and to the ever-changing styles of theory and methodology. Despite the proliferation of fashions I believe there are only a small number of fundamentally different options available. What I would like to do in the remainder of this brief time I have been awarded, is to tell you what I think are the main options and to try to set out where some of us stand on these options and why.
Distinctions are very important in an evolving science. Whether or not you start off with the assumption that natural versus violent motion is a fundamental distinction makes a big difference to your research program, as does the precise way you distinguish among weight, mass, and inertia. The reason that such distinctions are important is that you can only have a uniform scientific discipline of X if X is a natural domain -- if it includes phenomena that fall under some reasonably uniform set of principles.
The distinctions that are of fundamental importance to Cognitive Science are not, in my opinion, those between serial and parallel processing, or between continuous and discrete operations, or among von Neumann, non-von Neumann, or production system architectures. These are all real enough distinctions, and which side wins out will make a difference, inasmuch as one side will be right and the other will be wrong. But these are all local differences: it will surely turn out that some processes are serial and some parallel, some continuous and some discrete, some von Neumannish and some not, etc. Each particular case will be decided empirically on its merit.
But what really is fundamental is a distinction with which scholars have had a love-hate relation ever since they first thought about cognition. It is a distinction that has been studied and debated by nearly a century of philosophers, has been made scientifically respectable by computer science, and yet continues to arouse strong emotional reactions in almost everyone. Concerted opposition to it has spawned several influential schools of psychology, including behaviorism and Gibson's "direct realism". More recently it has been responsible for the growth in popularity of the so-called "new connectionism". It is a distinction which is denied daily by a growing number of otherwise reasonable people, and yet a distinction which -- in their heart -- everyone accepts. If this distinction turns out to be defensible -- and indeed if it turns out to be indispensable, as I believe it is -- it will mean that large parts of cognition will require explanations that differ in a fundamental way from explanations in other natural sciences. Having set myself up in this way, I now offer my official diagnosis of "what ails cognitive science" in terms of this distinction.
The distinction I have in mind is between two ways of explaining an observed regularity in the behavior of some system. One way is by appealing to properties or mechanisms that are intrinsic to the system, in other words, by appealing to certain functional properties or capacities of the system (what many of us have called its functional architecture). The other is by appealing to the existence of particular representations which the system manipulates according to certain rules (for example, rules of inference).
I will not discuss the distinction in any detail today, partly because I want to get to the point about why it's a problem for recent approaches to Cognitive Science, and partly because then you won't feel any need to buy my book which is on sale out in the hall at the MIT table. It's only $9.95 for the paperback version, with a substantial discount if you promise to believe what it says. I just want to point out, using a few simple examples, that the idea of understanding certain behaviors in terms of goals and beliefs, and the semantics of stimulus events, is so widely accepted that even the most doctrinaire behaviorist believes it. Why else would he solicit subjects by writing sentences on notices -- sentences which subjects have never before seen and which assert such propositions as that volunteers will be paid for participating? And of course the behaviorist implicitly assumes that there will be a rational connection between the content of the experimental instructions, the subject's desire to do as he is told, and the subject's behavior in the experiment.
It would also be absurd to deny, say, that part of the explanation for why I am here making these particular noises has something to do with the fact that Chuck Clifton invited me and promised that my food and lodgings would be provided, and has something to do with my desire to promulgate a certain view that I hold about Cognitive Science. Even if you don't believe the view I hold about Cognitive Science, you doubtlessly believe that it is correct to say that I believe it and that I am at the moment trying to persuade you of it! Moreover, not only is there a rational connection between such things as invitations, my beliefs, etc., and my behavior, but the particular behavior I exhibited could have been quite different if I had interpreted the antecedent events differently. For example, my behavior would have been different -- yet still rationally connected to my beliefs -- if Jim Moyer, the local organizer, had phoned me after the invitation had been issued and had said that Chuck had flipped out from overwork and was inviting everybody he could get hold of to give the after dinner speech, and moreover the entire budget of the society had already been spent. The regularity connecting stimulus events with behavior is subject to systematic change by collateral information. The proprietary term for this sort of plasticity is Cognitive Penetrability and you may hear me use that phrase a lot.
Although I have been talking very roughly about goals and beliefs, it still takes a lot of nerve to deny the relevance of such factors in determining behavioral regularities. Usually people only deny such obvious truisms when they are asked to give professional philosophical opinions, so it is surprising to find this kind of talk in a conference like this one.
Let me add just another example of when such an explanation is called for so that I can have a less facetious example to refer back to later. The question is, How do people know what the underlined pronoun refers to in each of the following sentences (based on an example due to Terry Winograd).
People readily understand these sentences, which entails that they assign a reference to the pronoun in each case. Surely, the explanation for how the reference is assigned must mention what the listener knows about city councilors, workers, demonstrations, communists, and so on. Only factors like this would explain why in particular cases the pronouns are assigned different referents in the two sentences and why the reference assignment could be easily changed by altering the information available to the listener. For example, a few years ago I inadvertently provided an example of the effect of collateral knowledge when I used these sentences in a talk I gave in Florence, Italy. I had forgotten that the city council of Florence was in fact communist. Because of this my audience had assigned the same referent to the pronoun in the first two sentences (which were the only ones I used in that talk) and the point of the example had been lost on them!
Notice that the point here is not merely that the process of pronominal reference-assignment is context-dependent. The point is that it is dependent in a rational way on the semantic information provided by the context. Moreover, what constitutes a relevant context can reach back into the indefinite past and become relevant through arbitrarily long chains of inference. It does not matter how you come to believe that the city councilors in question are communists -- if you are paranoid enough you might infer it from the most tenuous evidence, as did Senator McCarthy. Yet no matter how you reached that belief, your pronoun-assignment process would systematically and rationally take it into account.
OK now here comes one of the most important morals of this story. This is the hard part, so make sure you are listening carefully. Because of the sorts of considerations sketched here, cognitive science will not be able to get away solely with theories that tell us how the brain is wired or how activation spreads or how states of equilibrium are reached. That's because all such theories -- of necessity -- treat the cognitive system as being in causal contact with an environment. They cannot treat environments as the organism interprets them because interpreting is a relation that requires both environmental stimulation and reasoning from beliefs, goals, and expectations. This, in turn, requires inferential processes which such systems cannot support -- for reasons to which I will return in a moment. Because of this confinement to causal relations, such systems respond to physical properties of the environment (since these are the only ones that enter into causal laws). And that has been the Achilles' heel of every naturalistically-motivated school of psychology since Pavlov -- or perhaps Socrates (everything seems to go back to Socrates).
The problem is that the relevant relation between a cognizing organism and its environment is not a physical-causal one, but an informational, or to be more precise, an interpretive (or what some philosophers call an intentional) one. If you try to relate intelligent behavior to physically described environments you will come to the conclusion that, except for tripping, falling, bouncing off walls, and so on, behavior is largely independent of the environment -- which is clearly false. That was the elegant message of Chomsky's critique of Skinner way back in 1959. The plain fact is that people do things for such reasons as that they wish to find the holy grail, they wish to win someone's love, or they wish to get tenure. Yet neither the grail nor the presumably nonexistent love or tenure causes the behavior; only a representation of them can enter into the determination of the behavior. We reach for a cup of coffee not because of the coffee in the cup, for there may be no coffee in it after all, but because we believe there to be coffee, or because we see the black stuff there as coffee.
Once you accept this (surely obvious) story you have to face a further problem which, until recently, was thought to be insurmountable by a materialistic theory. That is the question of how a physical system like a mind can behave in ways that are coherently described in terms of goals and beliefs -- i.e. in terms of things that refer to aspects of the world that are not physical inasmuch as they are the world-as-believed (indeed, as in the previous examples, the relevant aspects may not even exist). Now, within the past half century, we have had the first serious proposal for how this might be possible. That proposal started with the formalist movement in mathematics and logic early this century, but the version that is of particular relevance to cognitive science is what Al Newell calls the Physical Symbol System Hypothesis. The proposal is that meanings are encoded as symbolic expressions and inferences are carried out over them by symbol manipulation.
The critical fact is that nobody has the slightest idea how inference can be done without such a process. You might say that nobody has a notion of a nonsymbolic reasoning process. Moreover, there are some good arguments why you will never have a reasoning system that fails to meet certain conditions. Among those conditions is the requirement that the states of such a reasoning system must have a componential and combinatorial structure. In other words, the states must have functionally distinguishable parts that can combine in novel ways, just the way symbolic expressions do. Furthermore, the component parts must be replicable -- that is, the system must be able to have and to functionally distinguish repeated tokens of the same substate type. That's because it is constitutive of reasoning that you be able to think different things about the same objects of thought. If an intelligent system can think that apples are red it must also have the capacity to think other things about apples (minimally that there are such things as apples). For example, if it can think that apples are red and that sugar is sweet, it must have the capability of thinking that apples are sweet (whether or not it does, of course, is another matter).
Another reason you can't get away with a system that has one brain state for apples-are-red, another for apples-grow-on-trees, a third for apples-are-a-type-of-fruit, a fourth for fruits-are-sweet and so on, is that such a system lacks the intrinsic capacity to draw inferences to new beliefs. It could not, for example, infer that there are such things as apples, let alone that apples are sweet. No matter how much additional complexity is built into such a system it cannot infer new beliefs of this sort because it does not have access to the component parts of the fused brain-states that would be required to trigger the appropriate state-transition. It's the apple part of the apples-are-a-type-of-fruit brain state, together with the recognition that the fruit part also occurs as part of the fruits-are-sweet brain-state, that allows the inference process to take place. Furthermore, without the componential structure you could not add new information about apples, such as that apples are edible, in such a way that it would lead to rationally connected behaviors -- for example, so that it would lead the system to eat those red fruity things that grow on trees, or to look in trees when it got hungry, or to answer such questions as "What are those red fruity things that grow in trees?"
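A minimal sketch in Python may make the contrast concrete. Everything in it is an illustrative assumption rather than anything proposed in the talk: the triple encoding, the relation names `is_a` and `has_property`, and the function `infer` are hypothetical. It shows only that the inference step works by matching a shared part of two beliefs, and that the same beliefs stored as unanalyzed wholes offer nothing to match.

```python
# Beliefs with componential structure: each is a (relation, subject, object) triple.
# The encoding and the relation names are hypothetical, chosen only for illustration.
structured_beliefs = {
    ("is_a", "apple", "fruit"),
    ("has_property", "fruit", "sweet"),
    ("has_property", "apple", "red"),
}

# The same beliefs as fused states: opaque tokens with no accessible parts.
fused_beliefs = {"apples-are-a-type-of-fruit", "fruits-are-sweet", "apples-are-red"}


def infer(beliefs):
    """Derive new beliefs by matching the shared part of two structured beliefs.

    If X is a kind of Y and Y has property P, conclude that X has property P.
    Only triples can take part; anything else has no parts to match.
    """
    triples = {b for b in beliefs if isinstance(b, tuple) and len(b) == 3}
    derived = set()
    for rel1, x, y in triples:
        if rel1 != "is_a":
            continue
        for rel2, y2, p in triples:
            if rel2 == "has_property" and y2 == y:
                derived.add(("has_property", x, p))
    # Report only beliefs that were not already held.
    return derived - triples


new_from_structured = infer(structured_beliefs)
new_from_fused = infer(fused_beliefs)
print(new_from_structured)
print(new_from_fused)
```

The structured store yields the new belief that apples are sweet because "fruit" occurs as a separable part of two different beliefs. The fused store yields nothing, since each of its states is a single indivisible token.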
And finally, as I argued many years ago in my critique of mental imagery theories, one of the nice properties that symbolic systems have is the ability to be indefinite or vague in certain well-defined ways. Unlike the representational states of fused-state systems (or systems whose "parts" do not correspond to the semantically interpreted aspects that occur in the knowledge-level description), the symbolic systems can represent such indefinite states of affairs as John is married to either Mary or Helen in such a way that the indefiniteness will interact properly with additional information (e.g. the new information that John is not married to Mary).
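The marriage example can be sketched in the same spirit. The Python fragment below is an illustration under stated assumptions, not a claim about how any actual system represents disjunction: the class `Indefinite` and its methods `rule_out` and `is_definite` are hypothetical names. It holds the indefinite belief as a set of open alternatives, so that new information interacts with it by elimination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indefinite:
    """An indefinite belief: exactly one of the listed alternatives holds.

    The class and its method names are hypothetical, for illustration only.
    """
    relation: str
    subject: str
    alternatives: frozenset

    def rule_out(self, candidate):
        """Return the belief that remains after learning the candidate is false."""
        return Indefinite(self.relation, self.subject, self.alternatives - {candidate})

    def is_definite(self):
        """True once a single alternative is left."""
        return len(self.alternatives) == 1


# "John is married to either Mary or Helen."
belief = Indefinite("married_to", "John", frozenset({"Mary", "Helen"}))

# New information: "John is not married to Mary."
updated = belief.rule_out("Mary")

print(belief.is_definite())
print(updated.is_definite(), set(updated.alternatives))
```

Before the new information arrives the belief is genuinely indefinite; afterwards only Helen remains. A representation that had to commit to one spouse from the start would have no comparable way to absorb the correction.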
Notice that all these conditions are trivially met by a system that can write and read symbolic expressions. So far as I can tell, if a system has the properties of componential structure and replicability of substates, it will be essentially a physical symbol system -- though it's a bit early to tell if that's all it needs. So if you accept that people's relation to their environment is an informational one involving the sort of (typically unconscious) reasoning that was illustrated in the pronoun reference example, you are pretty much stuck with symbol processing. It's the only game in town -- or, as Fodor puts it "the only straw afloat".
So finally,... if you have been paying close attention, you will find yourself drawing the inevitable conclusion that in order to understand cognitive processes one needs to understand at least two different kinds of system organizations.
1. One needs theories of the mechanisms that support knowledge-based or rule-based reasoning. This includes theories of those parts of the perceptual and motor systems that are themselves noninferential and are therefore cognitively impenetrable. It also includes theories of the functional architecture -- that part of the system which allows rules and representations to be encoded and accessed. There is no doubt that this level of organization has a lot of parallelism -- maybe even enough to merit the phrase "massively parallel" -- after all, at the level at which the architecture is implemented (i.e. at the register transfer level) the VAX is "massively parallel"!
2. One needs theories of the reasoning competence of intelligent systems. These theories could turn out to be similar to the normative logical systems around in philosophic logic, although the evidence to date (mostly from AI) suggests that current logics are not expressive enough and current inference systems are not powerful enough to deal with common-sense reasoning.
Before letting you off the hook so you can get to the reception I want to add a few caveats and remarks so you will see how completely reasonable this view really is.
> What about early vision, etc?
The first thing is to recognize that although a great deal of mental activity is of this knowledge-based sort, not all of it is. Moreover, as we are discovering, quite a lot of what was thought to be reasoning turns out not to be. For example, contrary to the teachings of the "new look" movement in perception, a lot of perception is not knowledge-based (e.g. what is called early vision, which appears to include everything up to the construction of the depth map or 2.5-D sketch). Moreover, although the evidence is not all in, I would not be the least surprised to find a lot of perceptual and motor learning to be in the nonrepresentational category, as well as such aspects of language comprehension as lexical lookup and maybe even morphology -- as Jay McClelland proposed in his discussion this morning. Memory retrieval is surely another process that has a significant architectural component in humans as it does in electronic computers, and it almost certainly involves a quite different mechanism from that of retrieval by address. And finally, I suspect that the explanation of much of concept-learning and such phenomena as the effect of moods, emotions and many psychopathologies will involve a major component that is not knowledge-based. Consequently, I have high hopes that the PDP, Boltzmann, and new connection people can make significant inroads in these areas. But not, as I have already asserted, where inference is clearly involved, unless these systems are overlaid with another level of organization which supports symbol manipulation functions.
> Can't everything be explained this way?
Belief-desire explanations can be given for almost any phenomenon: rivers flow to the sea because they like the salt water and believe they can get there by following a downward path. Thinking in such belief-desire terms can even get one into unnecessary difficulties. Consider the thermos bottle. It keeps warm liquids warm and cold liquids cold. A remarkable feat because how does it know? The answer to this challenge is that (1) there is no guarantee that we are not making a mistake in a particular case, (2) Lloyd Morgan's Canon tells us that we should never postulate a higher function when a lower one will do, (3) the sine qua non of a truly knowledge-based process is that it is logically labile -- i.e., it is in principle cognitively penetrable, and (4) if you think -- as behaviorists did when they accused Tolman and other early cognitivists of giving vacuous explanations -- that it's an easy matter to throw together a set of beliefs and goals to account for any piece of behavior, then you should try writing an intelligent expert system in even a narrow domain.
It's true that almost any intelligent behavior can in principle be explained in terms of goals and beliefs. But almost any chemical reaction can also be explained by the hypothesis that there are many different kinds of molecules that interact in certain ways. Neither of these explanations need be vacuous because there are a large number of constraints that must be simultaneously satisfied for the explanation to work. In both cases the various details have to be independently motivated and validated.
> What about chicken-sexing?
One of the things that bothers people about what I'll call the cognitivist line is that it postulates a whole lot of inferences, reasoning, and problem solving and other sorts of ratiocination going on where introspection reveals nothing. When we introspect, for anything but slow and deliberate problem-solving, we are convinced that the answer just emerges from some general disequilibrium followed by a leap of intuition. Even when we do have some subjective experiences during problem solving these often do not reveal the sort of activity that would qualify as "reasoning." Thus, Rudolf Arnheim attacks the Evans geometry-analogies system on the grounds that when people do these problems they go through a sequence of states involving a "rich and dazzling experience" of "instability" of "fleeting, elusive resemblances" and so on, whereas Evans' program doggedly pursues a search involving constructing pattern descriptions and description-differences. Similarly Bert Dreyfus accuses cognitive science of ignoring the difference between fringe and focal consciousness, and Steve Kosslyn accuses me and others of not taking seriously subjects' introspective reports of what goes on during episodes of reasoning using imagery. But one can no more build a theory of cognitive processing on the evidence of introspection than one can build a theory of solid state matter on the evidence of the appearance of hard objects. A theory of visual perception built on such evidence would be like this graphic Kliban theory of what goes on in the cat's mind during cat perception (which may, incidentally, even be true in the case of the cat).
The trouble is that our introspections are not observations of cognitive processes. They are reports of the content of our thoughts, or of what our thoughts are about. They can no more be taken as reports of mental structures or processes than they can be taken as reports of what neurons are doing. Despite this rather obvious criticism of introspective reports, I'm sure that the discrepancy between our experience during certain cognitive episodes and what cognitivist theories posit to be going on is behind a lot of the discomfort that people feel, and is at least part of the attraction of analog and other nonsymbolic theories. But in this case it is the discomfort that will have to yield in the face of empirically unavoidable conclusions, such as that the process of assigning a referent to a pronoun involves a considerable amount of unconscious inference. Eventually, people will feel at home with such ideas, the way they got used to equally unintuitive physical theories.
By the way, it turns out over and over that when you look closely at such skills as chess or chicken-sexing (both of which Dreyfus cites as examples of intuitive, as opposed to reasoning, processes) you find that there is a large component of reasoning involved (as Irving Biederman has shown in the case of chicken-sexing).
> Isn't this way of talking just a rough approximation?
Sure, human behavior is not all rational or even coherent. When you examine it closely you find that people consistently make certain kinds of logical errors (such as those investigated by Tversky and Kahneman) and there are exceptions to any rule system. In that sense any particular symbol-system model is an idealization. And, as with all idealizations, there is always the question of whether the boundaries have been properly drawn.
Now it seems to me that we have two options in dealing with the "approximate" nature of rule-governed theories. The first is to say that the divergence from pure rule-following that we observe in human behavior arises out of the interaction of the rules and representations with properties of the mechanism or architecture. These impose resource constraints and unreliability on both the representations (e.g. some of the beliefs may be false) and the processes (e.g. some valid inferences may not be drawn). This is the traditional way of dealing with idealization in science; empirical observations are assumed to be the result of an interaction of the ideal with other factors.
The second option is to deny the idealization entirely. But then you are faced not only with the problem of providing an explanation of the cases that deviate from the ideal, but also of providing an alternative explanation for the cases that would have been clearly covered by the idealized theory, such as in the examples I gave earlier (e.g. the explanation of why I am here or of how the referent to the pronoun is assigned). In such cases there seems to be no way of escaping the view that interpretation, inference, and decision-making are all relevant operations. And as I said earlier, there is at present no way to deal with such processes other than by a symbol processing theory. Of course, in the future someone might come up with an entirely new conceptualization of the whole process that unifies the clear and the deviant cases, but at the moment this is no more than an off-the-wall pious hope, bred of frustration with the admittedly limited success of building grand theories of intelligent behavior. But when you are thinking about where to place your bets you should look at what it is that draws you to the various options and ask whether there is any basis for optimism in the nonsymbolic camp. My own view is that, although some of these approaches are promising as ways of suggesting new mechanisms, their record in dealing with reasoning is at best nonexistent. Moreover, judging by the nonsymbolic models that have been proposed for such processes as mental imagery, the trend is in the direction of Rube Goldberg style ad hocery. There comes a time when one should reconsider the hairy-network theory of the mind -- as this spider has so wisely done with his network:
> Might not the "explicit reasoning" form of processing be confined to a negligibly small part of the system's activity?
The question of how much of what we now think of as cognition involves reasoning is not a fundamental question that threatens the whole representational approach. As I said earlier, it may well be that a lot of what we pretheoretically think of as cognition will be attributed to fancy memory storage and retrieval mechanisms, elaborate sensory systems, or parameter-setting learning mechanisms, rather than symbolic reasoning. These are empirical questions and we shall have to wait and see how it turns out. But to those who think that perhaps the only thing that will be left for representations to do is deliberate linguistically-mediated puzzle-solving I have two warnings. First, the evidence from information-processing studies is against you. Careful studies regularly reveal problem-solving processes taking place where people thought there was only intuition (e.g. studies of subitizing). Second, if you allow yourself a residue of reasoning somewhere in the system, there is always the danger that this is the part that is doing all the work. Extreme examples of this are the cognitive map theories (and some current holographic theories) that need a homunculus to make them act. My favorite contemporary example of this is the model of imagery that posits an internal display which is examined by the "mind's eye." I have argued that you can account for any possible result using such a system, providing you allow the "mind's eye" to act as an executive which examines the display and sets such parameters as the decay time, the rotation rate, the strategy for scanning the display and so on. But in that case it turns out that you don't need the display anymore since all the empirical phenomena are being accounted for by the executive. The point is that once you allow some reasoning into your nonsymbolic system you have to watch that the nonsymbolic part is not redundant. The paradigm example of this trap is the following story I first heard from Hilary Putnam.
The story is about an engineer who claimed for years that he had invented a perpetual-motion machine. Of course nobody believed him, but because he was a nice person they continued to humour him. Finally his bragging became too much and they decided to take him up on his offer to demonstrate the machine. The engineer led a group of his friends down to his basement, which was filled with a wondrous array of gears, levers, cogs, belts, and other visual marvels that glittered under the lights. His friends were much impressed by his workmanship. But one of them said to the engineer, "This is all very ingenious and wonderful, but there is one thing that bothers me. I notice that nothing is moving." "Oh that," replied the engineer. "That's nothing serious. It's just that I'm missing one part which is back-ordered. It's a little lever that fits here and goes back and forth like this forever."
There is the ever-present danger that the little bit of remaining reasoning capability we will need to allow in our nonsymbolic system is just this hook.
> What do I think of Cognitive Science?
Finally, my conclusion. What do I think of Cognitive Science, I heard you ask (didn't you?). I have always found psychology depressing because I came into it from physics and engineering thinking that, since it experimentally studied the human mind it was a science. I soon realized that it was not a science but a catalog, and a methodology for adding to the catalog. I don't doubt that it is a useful catalog: it's certainly important to know such things as how to help people who are depressed or to understand how people's memory or opinions can be changed in emotional contexts or by clever questioning (say in eyewitness testimony). But many of us had hoped that there was a theoretical science like physics or chemistry there somewhere and we were disappointed. I now believe that the problem is simply that there is no unitary subject matter for psychology -- it is not a natural scientific domain. But I find renewed hope now that within psychology lies one or more natural scientific domains, and that cognition, suitably circumscribed to include those aspects that are explainable in terms of symbol processing operations (together with the nonsymbolic mechanisms required to support symbol processing) may be one of those natural scientific domains.
Thus my answer to the question, What do I think of Cognitive Science? is exactly the answer that Mahatma Gandhi is alleged to have given to a western reporter who asked him what he thought of Western Civilization. Gandhi is said to have replied without hesitation, "Yes, that would be a good idea".
 This is the unexpurgated text of the presidential address, presented to the Cognitive Science Society's Eighth Annual Conference, Amherst, Massachusetts, August 16, 1986. It was an after-dinner talk and should be read in that spirit, even though there is a serious message hidden in there somewhere.
 When I say inference here I don't just mean logical deduction. I also include such meaning-dependent steps as default assumptions, heuristic reasoning, and even pure guesses. Psychological inference may have its own brand of quasi-logic.
 That is not to say, by the way, that connectionist or PDP or some other highly parallel and nonsymbolic system will not do inference. That's clearly not the case since, for example, a network of neurons such as the brain does it and a complex electronic network such as the VAX computer does it. The point is simply that in order to do reasoning or inference, such systems must have an additional level of organization overlaid on top of the network level of organization -- just as does any modern computer. The underlying implementation of the symbol-processing level of organization may well consist of a connectionist-like architecture.
Information for Class Use
What is Cognitive Science: PowerPoint slides of Lecture
Do we think in images? (click here for the PDF version)
Additional Readings for course "Things and Places"
Manuscript of MIT Press book "Things and Places: How the mind connects with the world"
Special Courses and Readings for Spring 2012
16:730:675 Seminar in Philosophy of Mind (Phil department listing)
16:185:600 Philosophy of mind (Cognitive Science Listing)
Readings can be found at: Readings for the Fodor/Pylyshyn course (including paper by Joe Levine)
Dr. Zenon Pylyshyn, Center for Cognitive Science (Department of Psychology, Center for Cognitive Science), (848) 445-1609. Email: zenon at ruccs.rutgers.edu
[Note: replace " at " with "@" -- this is to frustrate spammers who get their email addresses by automatic web-page scanning]
Cognitive Science Society Presidential Address Spring and Fall Fashions in Cognitive Science!
If you are browsing at this site you may also be interested in a talk I recently dug out. As president of the Cognitive Science Society in 1986 I was asked to inaugurate the tradition of after-dinner Presidential Addresses. The talk I gave was partly a serious plea for a certain view of the field of Cognitive Science that I felt had led to the creation of the Society, and partly stand-up comedy, aided by a variety of cartoons purloined from the New Yorker and from Gary Larson. Since that time I keep running into people at parties who remember me, not for the brilliant research I have done, but for the talk. I have even had people perform some of it better than I could have done!
If you are interested in the talk you can access it here, but I would urge you to click on the cartoons and other visual aids only after they have been introduced in the talk.
Click here for the talk Spring and Fall Fashions in Cognitive Science.
ZENON W. PYLYSHYN
Board of Governors Professor of Cognitive Science,
Center for Cognitive Science and Department of Psychology, Rutgers University
Zenon Pylyshyn received a B.Eng. in Engineering-Physics from McGill University, an M.Sc. in Control Systems from the University of Saskatchewan, and a Ph.D. in Experimental Psychology from the University of Saskatchewan for research involving the application of information theory to studies of human short-term memory. Following his Ph.D. he spent two years as a Canada Council Senior fellow and then joined the faculty at the University of Western Ontario in London, where he remained until 1994 as Professor of Psychology and of Computer Science, as well as honorary professor in the departments of Philosophy and Electrical Engineering and Director of the UWO Center for Cognitive Science. In 1994 Pylyshyn joined the faculty of Rutgers University as Board of Governors Professor of Cognitive Science and Director of the Rutgers Center for Cognitive Science.
Pylyshyn is the recipient of numerous fellowships and awards. He was awarded the Donald O. Hebb Award from the Canadian Psychological Association in June 1990, “for distinguished contributions to psychology as a science”. He is a fellow of the Canadian Psychological Association and the American Association for Artificial Intelligence. He has been a Killam Fellow, a fellow of the Center for Advanced Study in the Behavioral Sciences at Stanford, a fellow at the MIT Center for Cognitive Science and a fellow of the Canadian Institute for Advanced Research (CIAR). In 1998 he was elected Fellow of the Royal Society of Canada. In 2004 he was awarded the Jean Nicod Prize in Paris and delivered the Jean Nicod lectures. He is past president of two international societies: the Society for Philosophy and Psychology, and the Cognitive Science Society. For 9 years (1985-1994) he was national director of the Program in Artificial Intelligence and Robotics of the Canadian Institute for Advanced Research. He is on the editorial boards of eight scientific journals and has been on several industrial or academic scientific advisory boards.
Pylyshyn has published well over 100 scientific articles and book chapters, including a paper designated as a Science Citation Classic ("What the Mind's Eye Tells the Mind's Brain", Psychological Bulletin, 1973) and has given over 200 talks and keynote addresses. He is author of Things and Places: How the Mind Connects with the World (Jean Nicod Lectures Series, MIT Press, 2007), Seeing and Visualizing: It's not what you think (MIT Press, 2004) [Winner of the Association of American Publishers Professional/Scholarly Publishing Division Annual Awards competition], Computation and Cognition: Toward a Foundation for Cognitive Science (MIT Press, 1984), as well as contributor/editor of five books, including: Perspectives on the Computer Revolution (1988); Computational Processes in Human Vision: An Interdisciplinary Perspective (1988), The Robot's Dilemma: The Frame Problem in Artificial Intelligence (1987), Meaning and Cognitive Structure: Issues in the Computational Theory of Mind (1986), and The Robot's Dilemma Revisited (1996). As chairman of an NSF-sponsored panel on artificial intelligence, Pylyshyn also helped to produce a major survey of the state-of-the-art in artificial intelligence which appeared as part of the book What Can be Automated? (1980).
For the past fifteen years, Pylyshyn's personal research has dealt with two general areas. One is the theoretical analysis of the nature of the human cognitive system that enables humans to perceive the world, as well as to reason and imagine. This has led to a number of theoretical investigations of the "architecture of the mind". On the experimental side Pylyshyn has been concerned with exploring his Visual Indexing Theory (sometimes called the FINST theory), dealing with how human visual attention is allocated and how humans cognize objects and space. This theory hypothesizes a preconceptual mechanism by which objects in a visual scene can be individuated, tracked, and directly (or demonstratively) referred to by cognitive processes prior to their properties being encoded. Over a dozen papers have been published on this theory and its experimental investigation, as well as its implications for understanding how vision is connected with the world, making perceptual-motor coordination possible. The theory has implications for philosophical issues concerning the semantics of visual perception as well as practical applications for the design of human-computer interfaces.
Poster for VSS2012 is here: Effect of Occlusion and Landmarks on Single Object Tracking During Disrupted Viewing
PUBLICATIONS (and PREPUBLICATIONS) *For actual reprints for papers marked with * please email the author
Pylyshyn, Z. W. (2009). Perception, Representation and the World: The FINST that binds. In D. Dedrick & L. M. Trick (Eds.), Computation, Cognition and Pylyshyn. Cambridge, MA: MIT Press.
The empirical case for bare demonstratives in vision. In R.J. Stainton, C. Viger (Eds.) Compositionality, Context and Semantic Values: Essays in Honour of Ernie Lepore, Springer (2008).
(With S. Franconeri, J.Y.Lin, B.Fisher, A.T.Enns). Evidence against a speed limit in Multiple Object Tracking. Psychonomic Bulletin and Review, 2008, 15(4), 802-808.
(With H. Haladjian, C. King & J.Reilly) Selective nontarget inhibition in Multiple Object Tracking (MOT). Visual Cognition, 16(8), 1011-1021
(With D. Dulin, Y. Hatwell, and S. Chokron). Effects of peripheral and central visual impairment on mental imagery capacity. Neuroscience and Biobehavioral Reviews, 32 (8), 1396-1408.
Imagery. In Gregory, Richard. Oxford Companion to the Mind (Second Edition, 2006) Oxford University Press
(With V Annan) Dynamics of Target Selection in Multiple Object Tracking (MOT) (Spatial Vision, 19(6), 485-504)
Some puzzling findings in multiple object tracking (MOT): I. Tracking without keeping track of object identities Visual Cognition, 2004, 11(7), 801-822
Some puzzling findings in multiple object tracking (MOT): II. Inhibition of moving nontargets Visual Cognition, 2006, 14(2), 175-198
(With Brian Keane) Is motion extrapolation employed in multiple object tracking? Tracking as a low-level, non-predictive function Cognitive Psychology, 2006, 52(4), 346-368
Return of the mental image: Are there really pictures in the head? Trends in Cognitive Sciences, 2003, 7(3), 113-118
Mental Imagery: In search of a theory Behavioral and Brain Sciences, 2002, 25(2), 157-237
Is the "imagery debate" over? If so, what was it about? Cognition: a critical look. Advances, questions and controversies in honor of J. Mehler. E. Dupoux (Ed). Cambridge, MA, MIT Press.
(With E. Blaser and A.O. Holcombe) Tracking an Object Through Feature Space, Nature, 2000, 408(Nov 9), 196-199 [PDF]
Visual Indexes, Preconceptual Objects, and Situated Vision Cognition, 2001, 80 (1/2) (PDF file).
(With B. Scholl & J. Feldman) What is a visual object: Evidence from target-merging in multiple-object tracking Cognition, 80 (1/2) 159-177 (PDF file).
(With B. Scholl & S. Franconeri) The relationship between property-encoding and object-based attention: Evidence from multiple-object tracking. (Unpublished: If you would like a copy of the manuscript, write to one of the authors).
Situating vision in the world Trends in Cognitive Sciences, 4(5), May 2000, pp 197-207 (PDF File)
Is vision continuous with cognition? The case for Cognitive impenetrability of visual perception. [pdf file] In Behavioral and Brain Sciences, Vol 22, No 3, Jan 1999, p341-423 or click here for the long reprint file of the entire article (including commentaries)
(with C.R. Sears) Multiple object tracking and Attentional Processing. Canadian Journal of Experimental Psychology. 2000, 54(1), 1-14 [PDF file]
(with B. Scholl) Tracking multiple items through occlusion: Clues to visual objecthood. Cognitive Psychology, 1999, 38(2), 259-290. [PDF file]
The role of Visual Indexes in Spatial Vision and Imagery. In R. Wright, Visual Attention. New York: Oxford University Press, 1998. [PDF File]
What's in Your Mind?. In: E. Lepore & Z. Pylyshyn (Eds), What is Cognitive Science? [PDF File]
(with J.A. Burkell) Searching through subsets: A test of the visual indexing hypothesis. Spatial Vision, 1997, 11(2), 225-258.
Computing in Cognitive Science. In Posner, M. Foundations of Cognitive Science. Cambridge: MIT Press (A Bradford Book), 1989.
Computers and the Symbolization of Knowledge. In Morelli, Anselmi, Brown, Haberlandt & Lloyd (Eds.) Minds, Brains and Computers: Perspectives in Cognitive Science and Artificial Intelligence. Ablex, 1993.
(with Burkell, Fisher, Sears, Schmidt & Trick) Multiple parallel access in visual attention. Canadian Journal of Experimental Psychology, 1994, 48(2), 260-283.
Primitive Mechanisms of Spatial Attention. Cognition, 1994, 50, 363-384.
The role of location indexes in spatial perception: A sketch of the FINST spatial-index model. Cognition, 1989, 32, 65-97.
(with R. Storm) Tracking multiple independent targets: evidence for a parallel tracking mechanism. Spatial Vision, 1988, 3(3), 1-19. (formulae under construction)
Rules and Representations: Chomsky and Representational Realism. In A. Kasher (Ed.), The Chomskyan Turn. Oxford: Basil Blackwell, 1991.
Connectionism and Cognitive Architecture (with J. Fodor) Cognition, 1988, 28, 3-71
How direct is visual perception? Some reflections on Gibson's 'Ecological Approach' (with J. Fodor) Cognition, 1981, 9, 139-196