Brains, Minds, and Machines: Keynote Panel—Why is it Time to Try Again? A Look to the Future
POGGIO: So now I will start the real symposium. And I have the pleasure to introduce the moderator of this extraordinary panel, Steve Pinker. You all know Steve, so he does not need an introduction. You all know him from his books, starting with The Language Instinct, a great book. Steve was a colleague in my department here at MIT, and now we are loaning him back for a few days from Harvard. Steve.
[APPLAUSE]
PINKER: Thank you Tommy. Welcome to the golden age-- a look at the original roots of artificial intelligence, cognitive science, and neuroscience. Moderating this panel is a daunting challenge that puts me in mind of the famous quotation from John F. Kennedy, when he hosted a dinner at the White House for all of the living Nobel laureates of the Western hemisphere. And he said "This is the most extraordinary collection of talent and human knowledge that has ever been gathered together at the White House with the possible exception of when Thomas Jefferson dined alone."
It's not clear who the Thomas Jefferson figure equivalent would be in this case, although I imagine there might be some people in the audience who would nominate our late colleague, the inimitable Jerry Lettvin. The second reason this is a daunting challenge is that our distinguished panelists represent a remarkable diversity of interests from the biology of soil roundworms to the nature of logic, truth, and meaning. Nonetheless, I think there is a common thread-- that all of them are contributors to what I think of as one of the great revolutions in human thought, which is the scientific understanding of life and mind in terms of information, computation, and feedback and control.
Well into the 20th century, life and mind remained scientific mysteries. Life was thought of as this mysterious substance called protoplasm, some kind of quivering gel animated by an élan vital. The mind was thought of as a portion of some realm of the soul or spirit, or, according to the dogma of behaviorism, something that didn't exist at all, just one big category error. But then in the middle decades of the 20th century, ideas of thinkers like Turing, Church, von Neumann, Wiener, Shannon, Weaver, McCulloch, and Pitts gave us a rigorous language in which to understand the concepts of information and computation and apply them to domesticate these formerly mysterious realms, in the process revolutionizing biology and psychology.
They gave what became the insight that the stuff of life is not some magical protoplasm, but rather matter that's organized by information. And today, when we discuss heredity, we use the language of linguistics. We talk about the genetic code. We talk about DNA sequences being synonymous, or meaningless, or palindromic, or stored in libraries. Even the relation between hereditary information and the actual meat and juices of the organism, we explain with concepts from information, namely transcription and translation. The metaphor is profound.
Similarly, the stuff of thought is no longer thought to be some kind of ghostly spirit, nor a mirage, or a category error, but also can be understood in terms of information. That beliefs are a kind of representation. Thinking, a kind of computation or transformation. An action, a problem of control in the engineer's sense.
These ideas we take for granted now, but I am always struck going back to earlier great thinkers in biology and psychology-- how much they floundered without it. When one reads great philosophers of mind like Hume or great biologists like Darwin, I often wish that I could reach back over the centuries and tell them a few things about the modern science of information, because one could see that they were flailing around with hydraulic and mechanical analogies that could be so clearly explicated by what we know now about information and computation.
I think it was the 1950s and early 1960s that was a turning point in both fields. And the six people in today's panel were all, in different ways, instrumental in making it happen. I don't think I'm going to offend anyone's vanity by introducing them and inviting them to speak in order of age. Sydney Brenner is a biologist. All of us know that 58 years
[INTERPOSING VOICES]
PINKER: Everyone knows that 58 years ago, Watson and Crick explicated the structure of DNA. But DNA would be useless if there wasn't some way for the information that it contained to actually affect the development and functioning of the organism. The discovery of the genetic code is something that we owe to our first speaker, Sydney Brenner, working in collaboration with Francis Crick and others.
As if that weren't enough, the discovery of the operation of the genetic code in mechanisms of translation into-- transcription into RNA and translation into protein, Sydney Brenner was also instrumental in the modern science of development-- how the information encoded in the genes actually builds a three-dimensional functioning organism. And for that matter, neuroscience. Both through his choice of the lowly roundworm, Caenorhabditis elegans, C. elegans, often known as Sydney's worm, which has exactly 959 cells, 302 neurons, and therefore offers a perfect opportunity to reverse engineer the process of development and the wiring of the nervous system.
In recognition of this accomplishment, Sydney was awarded the 2002 Nobel Prize in Physiology or Medicine. But perhaps an even greater recognition is that he has been immortalized in the Linnaean taxonomy. A sister species of C. elegans, Caenorhabditis brenneri, has been named after Sydney Brenner. Sydney is a senior distinguished fellow at the Crick-Jacobs Center at the Salk Institute for Biological Studies.
Marvin Minsky is a computer scientist, and he is widely recognized as one of the founders, perhaps the founder of artificial intelligence, cognitive science, and robotics. He is responsible, among other things, for the first neural network simulator, the first computer simulation of semantic memory, one of the first music synthesizers, the first mechanical arm, and the first programmable Logo turtle.
He has since then been a major theoretician on how to apply computational analysis to problems such as vision, reasoning, learning, common sense, consciousness, emotion, and so on. He has been recognized with the Association for Computing Machinery's Turing Award, as well as the Japan Prize. But perhaps most significantly, he inspired Arthur C. Clarke when he was writing 2001: A Space Odyssey and served as a consultant for that movie, in particular for the computer HAL. Marvin Minsky is Professor of Media Arts and Sciences, Emeritus at MIT.
Noam Chomsky is a linguist who revolutionized the study of language by changing the very nature of the questions that the field attempts to answer. Noam pointed out that the most profound mystery of language is, first of all, that any competent speaker can produce or comprehend an infinite number of novel sentences. Therefore, our knowledge of language has to be captured by a recursive generative system. Noam explored the mathematics of such systems, has developed a number of theories of what kind of recursive generative system is implemented in the human mind, and inspired the modern science of psycholinguistics that explores how language is processed in real time.
Noam also set, as a problem for linguistics, the project of figuring out how children acquire a first language without formal instruction in an astonishingly short period of time, suggesting that children are innately equipped with a universal grammar of the abstract principles behind language, an idea that led to the modern science of language acquisition. Noam is also largely responsible for the overthrow of behaviorism as the main dogma in the study of psychology and philosophy, for rehabilitating the philosophical approach of rationalism, and for making the concepts of innateness and modularity respectable in the study of mind.
Chomsky has been recognized, among many prizes, with the Kyoto prize, the Helmholtz medal, and the Ben Franklin medal. He is also the most cited living scholar, according to citation counts, and has become such a fixture of popular culture that Woody Allen had him feature prominently in his New Yorker story "The Whore of Mensa." And I believe that Noam is the only MIT professor whose lecture has ever been featured on the B-side of an album by the rock group Chumbawamba. Noam is Institute Professor Emeritus in the Department of Linguistics and Philosophy here at MIT.
Emilio Bizzi is a neuroscientist who made early contributions to our understanding of sleep, but has spent most of his career studying the process of movement. Emilio is responsible for setting the agenda for the neuroscience of movement, pointing out that the movement of animals is not simply a set of knee jerks or pressing a button and a lever moving, but rather has goal-directedness; that a simple muscle motion would be useless if it was executed the same way regardless of the starting position of the body; and rather that movements have to be organized towards positions in space.
Movement is therefore an intelligent process at every level of the nervous system. Emilio is also largely responsible for the maturation of neuroscience as a field, which did not even exist as a name when he entered the field, and was the founding director of the Whitaker College at MIT, and founding head of the Department of Brain and Cognitive Sciences, in which I was privileged to teach for many years. In addition to his scientific brilliance, Emilio is widely respected as a man of judgment and erudition, and his wisdom has been tapped in many scholarly organizations. Together with prizes such as the Empedocles Prize, and the President of Italy Gold Medal for achievements in science prize, Emilio has served as the President of the American Academy of Arts and Sciences. Emilio is currently Institute Professor of Neuroscience here at MIT.
Barbara Partee is a linguist. She was a student in the first class of the fabled MIT graduate program in linguistics, and as the first recipient of a National Science Foundation Fellowship in linguistics, her graduate career symbolically inaugurates the appearance of linguistics as an official science. Barbara transformed the field of linguistics, which hitherto had concentrated on syntax and phonology by putting semantics, the study of meaning on a formal, rigorous foundation, tying linguistics to the world of logic and the concepts of meaning and truth.
She remains the world's foremost semanticist and moreover trained all of the world's other foremost semanticists. When she retired recently, her students presented her with a genealogical tree of all of the people that she has trained. They noted that it has a depth of four, which means that she has trained several great-great-grandstudents, and it has 176 nodes.
Barbara has been recognized with the Max Planck Research Award from the Max Planck Society, and is currently Distinguished University Professor Emerita, that's feminine for Emeritus, at the University of Massachusetts at Amherst.
Patrick Winston is a computer scientist. He was one of the first researchers to give a rigorous analysis of the concept of learning in the framework of modern symbolic artificial intelligence, and his work on the concept of learning continues to be influential today. He was also instrumental in transforming artificial intelligence from a bunch of hackers in plaid flannel shirts to a respectable academic field through his textbook, through directing the artificial intelligence laboratory at MIT for most of its existence, and his famous course in artificial intelligence, as well as starting a number of companies.
Patrick has been recognized by a number of teaching awards, including the MacVicar, the Baker, and the Graduate Student Council Award here at MIT, and is widely known and beloved, aside from his contributions to artificial intelligence, for his training of teachers through his famous IAP talk How To Speak, which has influenced many of us. Among the many suggestions for how to keep a class engaged, he suggested that every lecturer should cultivate some eccentricity, whether it be wearing a rope belt or tugging a lock of your hair, or erasing the blackboard with both hands. I was influenced by this as a postdoc, and ever since I have always lectured wearing a gaudy necktie. Patrick is Ford Professor of Artificial Intelligence and Computer Science at MIT.
I have asked the six panelists to spend 10 minutes each sharing any personal thoughts that they think the audience would enjoy on the birth of the modern sciences of life and mind and their reflections on the origins, key questions, key discoveries, and open challenges of your field. So we'll start with Sydney. I'd like to ask each of you to speak for about 10 minutes, and then I will ask you to amplify, reflect, ask each other questions, and so on. I'll put a little timer here to just remind you how much time has elapsed.
BRENNER: I've shrunk in recent years. Well I thought I'd give-- to explain why I am here, I thought I'd give two classes of reasons. The first will be sentimental, and I'd like to say that my association with artificial intelligence has always been as an interested spectator. I don't think I ever played the game. My interest in it came from a very long friendship with Seymour Papert. We grew up together in South Africa. We shared a room in the Department of Physiology.
I taught Seymour neurophysiology. He taught me mathematics, and I came to the conclusion he was both a better student and a better teacher. It was the connection with Seymour, when I finally grew up, that brought me into association with the Department of Artificial Intelligence. And I sent one of my best students here. I wouldn't say he was a student-- one of my best colleagues here, David Marr. And so those are the reasons why I've always been, and in later years, which I ask you to ponder on, there was a great adventure here, producing something called a thinking machine.
In fact there was a corporation named after that. And I suppose that's what we are after. We have to resuscitate thinking machines. Now I'd like to give you the mental reasons why I'm here. I was very influenced in about 1951, it was, reading an article by John von Neumann, called the theory of self-reproducing machines. It had been published in a book called The Hixon Symposium of Cerebral Mechanisms. And it's a very interesting book to read through, because you can see all the false leads. That paper of von Neumann's went unattended to, yet it had the actual basic logical construct for the way that genes work. And in fact, the whole lot of this is done without any reference to biology.
And indeed when the biologists discovered that this was the machinery, then nobody had mentioned von Neumann. In fact they all paid very great respect to Schroedinger. Schroedinger wrote a book called What Is Life. Everybody claimed to read it. I read it. I didn't understand it. So it had no influence on me. But in thinking back on it some years ago, I came to the conclusion that Schroedinger had made a fundamental mistake. He said the genes contain the program for development, as we would put it this day, and the means to execute them. What von Neumann said-- they didn't contain the means of execution-- they contain a description of the means of execution.
In other words, the program is not self-reading. You have to build a reader for it. And that's of course, what von Neumann-- and without this, you can't make a self-reproducing machine, because it has to transmit to the next machine a description of the means to do it. And I think that this is the fundamental thing that lies behind this. And so, if you like, if you want to say, I've got this text in DNA. It's a long sequence. Can we read it? Can I look in there and say, yes, that's a zebra, and it's going to be able to do these things. And that is if we believe in what we can do.
So when you take this and you try to analyze the relationship between what we inherit, and how we can perform, and the big argument at the time that there's a connection between genes and behavior, which gave rise to a whole lot of other problems, such as saying that intelligence is something that is inherited with the genome. Those are not the questions really to ask, because I think you must divide the problem into two, as indeed it is divided into two in von Neumann's view of it.
The first thing is how do the genes specify and build a machine that performs the behavior? And how does that machine perform the behavior? That is a separate question. Of course, the two are connected, as indeed they are. But they must be distinguished, because what we're asking is if we're looking at the behavior, the behavior is represented in the genome as a description of how to build a machine that behaves. And you see this is very important to get that through, because the deepest problem is how did all of this evolve, because you can only change the description.
So there are very interesting questions that are attached to this. And in following this line of thought, I thought that the only way to give a scientific theory of a nervous system is to ask, how does the wiring diagram, if I can call it that, compute behavior? Because if we know how this is done, we can look at the deeper computation later, which is how is the script translated into the machinery that builds this.
And in fact, I think a lot of science will now go to what I call the forward question, which is how do we connect the output of a system with its wiring diagram, which is the thing I think we have to solve. But of course, this is a grand thing, and one of the things you learn is that there is a difference between vision and eyesight. I learned this by describing a friend of mine as a man of great vision but poor eyesight.
So we have to have good eyesight to implement anything like this, and that is the story of the worm. The worm has 302 neurons-- seemed to me it was a finite problem. We could come to the end of the description, and then after that, we could deal with the questions that everybody raises when you do modeling, which is the skeptic who stands there and says, how do you know there's not another wire that started in the big tail, runs up at the back, and connects this with this?
You have to be able to say we know all the wires. Then you can proceed. So this was, then, to get to a total analysis of the structure of the nervous system, the structure of the brain of the worm. 302 neurons, which we finally accomplished. It took 20 years, because we had many, many other romances with computers in-between. We tried to mechanize this, but to do this in the 1960s was impossible.
Nowadays it can be done. And there's huge activity in getting these wiring diagrams specified. In my opinion, sometimes over-zealously applied, because it's not going to answer the question that we wanted to answer-- if I take a worm of exactly the same genetic structure, will I find exactly the same wiring diagram? Today you could ask, sure, I've just cut a section of this mouse's brain, and I found these synapses. And you say well maybe if you delayed your knife for half an hour, would you have found the same synapses?
And indeed, if you go to another mouse, and look at the same cell, is it going to be the same there? However, I think that in approaching this question-- I think that we are on the threshold of really asking very serious questions, both in biology and in relation to the behavior of complex systems. And that's why I think this initiative is very, very important to do now, because I think we'll have to take a completely different approach to those in the future.
As you well know my colleague Francis Crick got interested in consciousness at one stage, and I was asked some time ago what did I think of consciousness as a problem. The great physicist Wigner thought that consciousness would be the explanation as to why you can't predict anything with quantum mechanics. That there would be this added ingredient.
My view is this. I think consciousness will never be solved, but will disappear. 50 years from now, people will look back and say, what did they think they were talking about here, right? I'll tell you one other thing that has the same thing and has disappeared. It is a thing called embryological determination.
We had many discussions in the '50s and '60s-- is determination different from differentiation? Are they different processes? Nobody talks that way anymore, because we have a completely different view of this. And this is what I would like to say-- is going to be changing those views, changing many of the other views, of which I'm not equipped to talk about.
But I tell you the one view that I would like to see very much changed, and that is the view of what a gene stands for. And in doing this, I will just deliver a little parable in my sermon. It was said that when we sequenced the human genome, and we sequenced the chimpanzee genome, we will find one extra gene in the human. That will be the gene for language, and we'll call it the Chomsky gene.
But there's an alternative explanation, which is during their evolution, chimpanzees discovered talking gets you into trouble. And so they evolved a language suppressor gene. Then we'll call it the Chomsky gene. Thank you very much.
[APPLAUSE]
PINKER: Marvin?
MINSKY: I have too many notes. When Brenner started working on his worm, C. elegans, that was a big project. 1960, was it? And to map out the nervous system of this animal, as well as the rest of it, you had to make very small slices with a microtome and trace the nerves and other organs in every layer.
And that's a tremendous amount of work to do by hand, and he had employees, graduate students, I don't know what, and this process was taking a long time. At the same time, McCarthy and I had started the artificial intelligence group here and had done a lot of work on computer vision, even by 1960. So I visited Brenner, and suggested that he should import a couple of our graduate students who would automate the process of vision, and Brenner said no. And I said, why. And he said, well, both of our fields are in their very early stages, and I have these budding genetic biologists or whatever, and you have these budding computer experts, and if you let any of them into my lab, all of my students will realize that your subject is much more exciting and easier. Do you remember that?
[LAUGHTER]
And it took me a while to realize he was absolutely right. I don't know where to start. I love MIT, and I've been here since, in a sense, since 1957, because I started at Harvard as an undergraduate, and I could talk about that for two days. But in the course of-- and then I went to graduate school at Princeton. And I'm not sure where I'm going with this, but every time I've got interested in doing something, I was incredibly lucky. I never had to do any real work.
If I had a mathematics problem-- well, when I was a sophomore at Harvard, I sat down at a table with a young man named Andrew Gleason who had won the Putnam first prize in the Putnam mathematical competition three or maybe four years in a row, which sort of excluded everyone else in the world. And so I asked what he was doing, and he said he was trying to solve Hilbert's fifth problem, which is that every locally continuous group-- oh, I forget what it is. Anyway-- oh, every continuous group is differentiable. It's a very nice problem. And I said, how long do you think it will take to solve that? And he said nine years. I had never met anyone with a plan of that magnitude. Only a few years older than me, and I asked what that was, and he said well first I'll have to do Dedekind cuts to show that the thing has to have real numbers, and blah, blah, blah. Anyway, it took him eight years.
And I think somehow I was very early in my research career, and it seems to me that eight years was plausible. Let's do things and-- anyway, I was just very lucky. I ran into von Neumann at some point when I was doing my thesis at Princeton, which was on neural networks, and I got the idea of doing that from earlier work by McCulloch and lots of people in the 1940s who-- it turned out that in 1895, Sigmund Freud had written a little paper on neural networks, but no one would publish it. And it wasn't published till around 1950. Well, I guess the same question came up, because I was in the math department at Princeton, and they asked von Neumann is this PhD thesis mathematics, and so I'm happy to say that he replied, if it isn't now it soon will be.
And anyway, there was something great about the era. And it's hard to talk about how things started, because it was so different from how it is today. See, 1950 is shortly after World War II, and the country was very prosperous, and there were lots of research institutions, because there were a lot of monopolies. I mean a company like General Electric or Westinghouse or even CBS were pretty huge, and Bell Laboratories, of course, was a monopoly. The others only acted like it. And so, they would start out different projects and John McCarthy and I, in the summer of 1951 or 1952-- anyway, we got summer jobs at Bell Labs, and someone told us not to work on any problem that would take less than 30 years. They didn't have the tenure deadline either.
Now to get tenure, you have to be quick, because legally, I think, tenure is seven years, but they like it to be six years, because you don't want to keep people hanging. And what's more, if you fire somebody before that, you have to pay about a year termination fee. And, oh well, blah, blah. So it's very hard to get a research grant for five years, which was sort of the standard that NIH and many of these laboratories had. So any questions?
Anyway we started this artificial intelligence laboratory, and the first few-- I'll just stop in the middle of a sentence when time is up, because I never make plans. I could talk-- what? Oh, oh, it's an iPhone.
[LAUGHTER]
I have one. Mine is pink. Anyway, how did all of this happen? When I was a child, there were a lot of books in the house, and I got interested in reading them. And it seemed to me that they fell into two classes. One was fiction, and that was novels. This is a joke I've made many times. And I read a few novels, and it seemed to me that they were all the same, even by great writers from Shakespeare on down. And there were about six or seven sins or the major ways that people screw up their lives. And so I would pick a typical piece of fiction that describes a situation where something goes wrong, and you try to fix it, and then something else goes wrong. And then this goes on until finally you cure them all or you die. It doesn't make much difference. So that's what general literature is about.
And science fiction is about everything else. So somehow Isaac Asimov lived nearby when he had just finished, and just as I ran out of the Jules Verne, and H. G. Wells, and Aldous Huxley, and the John Campbell, and the early science fiction writers-- exactly the right people started to turn up, which was Robert Heinlein in 1942 I guess. And Isaac Asimov a little later, and Frederik Pohl.
I met Isaac because he lived in Newton, which is nearby. And at this time, or somewhat later anyway, we were building the first robots. Pretty much, although a guy named Grey Walter, neurologist in the west of England had made some interesting ones. And I invited Asimov to come and see our robots, and he refused. And this happened a few times, and finally I said, why not. And he said, I'm imagining really intelligent robots of the future. And if I came, I'm sure that your robots are really stupid.
[LAUGHTER]
Which they were-- they could play a very slow game of ping pong and a few things like that. You might wonder why aren't there any robots that you can send in to fix the Japanese reactor that's broken, reactors I should say. The answer is that for the last-- there was a lot of progress in robotics in the 1960s and '70s. Then something went wrong. If you go to Carnegie Mellon, or-- I don't want to mention names-- or anywhere else, you'll find students are slaving over robots that play basketball, or soccer, or dance, or make funny faces at you, and they're not making them smarter. They don't use or realize that they need something like artificial intelligence.
So I'd like to see a return back to the old days. Brenner mentioned consciousness, I think, and I agree entirely that it's a very bad word, because if you look at my recent book, which is badly named The Emotion Machine, it says notice that people use the word consciousness for 35 very different things, like remembering what you've recently done, or making certain kinds of choices, or reacting to certain stimuli, and so forth.
And so virtually everything we don't understand about the mind when people do things, is attributed to this word. So anyway, that's one of the problems. And yet I see that professional people in various sciences still use the distinction between conscious and not. And as far as I can see, you can't get rid of that word, because what's its main use? Its main use is deciding when someone is a criminal. If they run over you by accident, well that's very bad, and you might lose your license. If you run over someone on purpose, that's intentional, conscious, blah, blah, blah.
And so we have the strange situation in psychology that I don't see in other sciences, which is to use words that have been around for a long time without-- I'm down on the 10% of my notes, so.
[APPLAUSE]
PINKER: Thank you. Noam Chomsky.
CHOMSKY: Well I kind of like that idea of the language suppressor gene. It actually has a venerable history, which maybe you know. In the 17th century, when westerners were discovering all kinds of exotic creatures like apes and blacks and others, and they weren't sure which was which, there was a lot of discussion about whether ape, orangutans, you know, apes can speak. And they didn't seem to, and there was a lot of debate about why they didn't. And there was a proposal, that there was-- they didn't call it genes in those days, but there was the analog of a language suppressor gene. Louis Racine, who was the son of the famous dramatist, suggested that apes are really more intelligent than humans. And the proof of it is that they don't speak, because they knew that if they did speak, we would enslave them. So they keep quiet, you know.
[LAUGHTER]
And this incidentally led to the conclusion that a brilliant father can't have a brilliant son. I was in RLE building 20 in the golden age. And it actually was a golden age, but I got here in 1955, and it was quite an exciting place. A lot of enthusiasm, innovation, a very lively interaction among people of widely different fields. It's a kind of community which I think will be very hard, if even possible, to reconstruct. I also have many very warm memories of it. In fact, some of them very hot memories. Those of you who were around in those days may remember that over the summer, building 20 was unbearably hot. You could barely survive.
Morris Halle and I then shared a small office, and we decided we would try to get the Institute to put in an air conditioner, a window air conditioner. So we sent a message up through the bureaucracy, and it finally reached whatever high office it got to, and something came back down to us finally with a message saying you can't put in an air conditioner, because it wouldn't be consistent with the decor of building 20.
[LAUGHTER]
Those of you who have ever seen a picture of building 20 would know what this meant. Fortunately, we could find the friendly janitor, who for $10 was willing to break the rules, and we were able to survive the summer. Back a few years earlier-- in fact, going back to the time of the famous paper, Alan Turing's paper on machine intelligence in 1950. In the early '50s, there was a small group of graduate students down the road who were dissatisfied with the prevailing conceptions of how to understand thought, behavior, language, and so on.
And we were trying to see if we can construct another way of looking at these topics which might be more promising and integrated better into the general sciences. And Turing's comments had a certain resonance. You may recall that in his paper, he-- which was about machine intelligence-- he begins by saying that the question of whether machines can think is too meaningless to deserve discussion. He didn't explain why, but he presumably meant that it's a question of what kind of metaphor you are willing to accept.
So it's like asking do airplanes really fly or do submarines really swim. If you want to extend the metaphor, yeah. If not, no, but it's not a factual question. He nevertheless went on to say that it would be a very good idea to construct hard problems to see if you can design machines, meaning hardware and software, to solve them. And the famous proposal of his was what he called his Imitation Game. Later, it came to be called the Turing test. He thought that might be an incentive to develop better machines, better software. And for that reason, it's good to do it, but we're not asking do machines think.
He also suggested that, maybe, in the course of time-- he said 50 years-- because of this work, people will come to think about thinking in a different way. You can ask whether that's happened. The machines-- it certainly was an incentive to develop better machines, and that had a certain analogy to the kinds of things we were concerned with. For one thing, when you look at language seriously-- it wasn't really done at that time, unfortunately-- you can see right off that what each of us has internalized is some kind of computational system, a system which, in fact, determines an unbounded array of structured expressions, each of which has a dual interpretation. It's interpreted by the sensory motor system as some kind of externalization, sound, or we now know other modalities.
And it's interpreted at an internal, roughly speaking, kind of thought system, sometimes called the conceptual-intentional system, as having a specific meaning, which then enters into the planning of action and so on. So that's a computational system, and the understanding of computational systems had advanced quite considerably by the mid 20th century, in large part because of Turing's work. And that sort of fit naturally.
Also, his provisos made some sense for us. There is a standard question-- what counts as a language. And I think Turing's response was accurate-- the question is too meaningless to deserve discussion-- for the same reasons as machines thinking. If you want to call the communication systems of hundreds of species of bees language, OK, you can call it that. That means you're accepting a certain metaphor. If you don't want to call it that, don't.
It's not a factual question. The fact of the matter is these systems differ radically-- every animal communication system known differs radically in its basic structural principles of use, and others, from human language. And human language is in fact-- here it's different from Turing's case-- it is, in fact, a particular biological system, one that each of us has internalized. At its core, a computational system, with the kind of properties that I mentioned. And there's no problem here about constructing hard problems to deal with, because it was quickly discovered, as soon as the effort to construct such computational systems was undertaken, that almost nothing was understood. Everything you looked at was a puzzle.
And the more you learn, the more puzzles you discover. That was kind of surprising at the time, because it was assumed, the prevailing conception was that everything was more or less known. There aren't any real problems about language. It's just a matter of-- a famous version was Van Quine's, a highly influential philosopher, perhaps the most influential, whose picture was that language is just a collection of sentences associated with one another and with stimuli by the mechanism of conditioned response. So all there is to do is just trace the individual cases of histories of conditioned response, and you get the answers.
Among linguists, it was very widely assumed that languages can differ in virtually any possible way, with virtually no constraints. And the only problem for linguistics-- I remember this as a student-- was to collect more data from a wider variety of languages, and apply to them the techniques of analysis, which were assumed to be more or less understood. And that's the field. But nothing could be puzzling.
On the other hand, as soon as the effort was undertaken to construct an actual computational system with the properties that I mentioned, it was discovered that everything is a puzzle. Nothing is understood. And some of the puzzles are quite simple, and some of them are still outstanding, in fact. So just to be concrete, take the simple sentence-- can eagles that fly, swim? OK, we understand it means we're asking a question about whether they can swim. We're not asking a question about whether they can fly. But that's a puzzle. Why doesn't the can relate to the closest verb? Why does it relate to the more remote verb, one which, in fact, is closest in a different sense? It's structurally closest. So similarly, we can say are eagles that fly, swimming. But we can't say are eagles that flying, swim. Now that's a reasonable thought. It's asking is it the case that eagles that are flying, swim. But somehow we can't say that-- the design of language prevents us from articulating a perfectly fine thought, except by paraphrase.
And there is a question why that should be true. From a computational point of view, computing the closest verb, computing linear order and linear closeness, is far simpler than computing structural closeness, which requires complex assumptions about the structure of the object. So there's a puzzle. And that puzzle has been around for thousands of years, except nobody thought it was a puzzle. It's one of innumerably many such cases. I should mention that in the last couple of years, a sort of substantial industry has developed to try to deal with it on computational grounds. I won't talk about that. I don't think it gets anywhere.
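The contrast between linear and structural closeness can be sketched in a few lines of Python. This is only a toy illustration, not something presented at the panel: the `Node` class, the bracketing assumed for the sentence, and the use of tree depth as a stand-in for structural distance are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Node:
    label: str                      # category label, e.g. "S", "NP", "V"
    word: Optional[str] = None      # set only on leaves
    children: List["Node"] = field(default_factory=list)


def verbs_with_depth(node: Node, depth: int = 0) -> Iterator[Tuple[str, int]]:
    """Yield (verb, depth) pairs in left-to-right (linear) order."""
    if node.label == "V":
        yield node.word, depth
    for child in node.children:
        yield from verbs_with_depth(child, depth + 1)


def linearly_closest_verb(tree: Node) -> str:
    # First verb met when scanning left to right from the fronted "can".
    return next(verbs_with_depth(tree))[0]


def structurally_closest_verb(tree: Node) -> str:
    # Verb with the fewest nodes between it and the root of the clause
    # (depth is used here as a rough proxy for structural distance).
    return min(verbs_with_depth(tree), key=lambda pair: pair[1])[0]


# Assumed bracketing: [S [NP eagles [RC that [VP fly]]] [VP swim]]
EAGLES = Node("S", children=[
    Node("NP", children=[
        Node("N", "eagles"),
        Node("RC", children=[
            Node("C", "that"),
            Node("VP", children=[Node("V", "fly")]),
        ]),
    ]),
    Node("VP", children=[Node("V", "swim")]),
])

print(linearly_closest_verb(EAGLES))
print(structurally_closest_verb(EAGLES))
```

The linear rule needs nothing but a left-to-right scan of the words, while the structural rule cannot even be stated without the tree. That asymmetry is the puzzle: speakers use the rule that looks more expensive.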
But it's a real puzzle, and furthermore that principle of using structural distance but not linear distance, is all over the place. You find it in structure after structure, in all languages, and it seems to just be some universal property of language, and a very puzzling one. Well, discoveries of that kind are kind of reminiscent of the earliest moments of modern science. If you go back to, say, 1600-- for thousands of years, scientists had been quite satisfied with an explanation for such things as why stones fall down and steam goes up. They're seeking their natural place. End of that story.
When Galileo and a few others allowed themselves to be puzzled about that, you start getting modern science, and I think that continues. The capacity to be puzzled about what looks sort of obvious is a good property to cultivate. And it's an awakening, and that began at that time. And it turns out there's enormous linguistic evidence that this is the way things work. There's also, in recent years, some evidence from the neurosciences-- there's some interesting investigations being carried out by a very good group in Milan. The linguist in the group, Andrea Moro, is well known here. He's been here many times.
They've been investigating the reaction of people to two types of stimuli, all new to them. Rule systems that meet the basic conditions of what's called universal grammar, of the general principles of language, and others that violate these conditions, and use things like linear order. So for example, an invented language in which, to negate a sentence, you take the negative word and you put it in the third position in the sentence, let's say. And it turns out that brain activity is quite different in these cases.
In the case of a normal language, which the people have never been exposed to, Broca's area, language areas are activated in the normal way. But in the case of linear order, it's not. The language areas don't have the normal activation. People may figure it out, but they're figuring it out as some kind of a puzzle to solve, not using their linguistic capacities. Well, all of this ought to be puzzling. And it is, and, in fact, as far as I know, the only serious proposal as to how to explain it has pretty far-reaching conclusions. The natural assumption is that at the point at which the computations are taking place in the brain, there just is no order. There's no ordering at all. All there is is hierarchy. So the only thing the brain can compute is minimal structural distance, because there's no linear order.
Well that looks pretty plausible. In fact, from many points of view. One reason is that if you look at the syntax and semantics of language, for a very broad core class of cases, order doesn't matter. What matters is hierarchy. And furthermore, if you look at-- so that suggests that one of those two interfaces, the semantic interface with the thought system, just didn't care about order. On the other hand, the sensory motor system must have some kind of arrangement depending on what the form of externalization is. If it's speech, it'll be linear order. If it's sign, it'll be various things in parallel. But it has to have some kind of ordering.
So it's reasonable to suppose that the ordering is simply a reflex of the externalization, but doesn't enter into the core properties of language, which simply give you what's used in the thought system. That, effectively, means, if it's correct, and it seems to be, that in the informal sense of the term design, language is designed for thought, not for externalization. Externalization is an ancillary process, and that would mean, incidentally, that communication, which is just a special case of externalization, is an even more ancillary process, contrary to what's widely believed.
That has significant implications for the study of evolution of language-- I won't go into them, but you can think them through-- and for the nature of design of language. Well there are other kinds of evidence that lead to similar conclusions-- it says a lot about the architecture of mind if this is correct. For example, one class of cases is a ubiquitous phenomenon in language, sometimes called displacement. You hear something in one position, but you interpret it somewhere else.
So if you say what did John see? You have to interpret the word what in two positions-- where it is, where it's kind of an operator asking you some question about what person or something like that. And also in another position, as the object of see, just as in John saw the book or something, where it's getting its semantic role as the object of the verb. It's got to be interpreted in both positions.
It's as if the mind is hearing something like for which thing x John saw x, where the for which thing is an operator ranging over the variable. That shows up in other ways. That unpronounced element turns out to really be there for the mental processes that are taking place. You can see that in sentences like say, they expect to see each other, where the phrase each other is referring back to they, say the men expecting to see each other. On the other hand, if you had another noun in-between, in the right position, it wouldn't work.
So you say the men expect Mary to see each other, then that somehow breaks it up. Each other can't go back to the men. Well what about who do the men expect to see each other? Well that's like the men expect John to see each other. Each other doesn't go back to the men. But there's nothing there in the form that you hear. It's just like the men expect to see each other. Well that suggests strongly that the mind is actually hearing, interpreting, for which person x, the men expect x to see each other, and that x is like John when you pronounce it. And things like that can get pretty complicated.
So if you take a sentence like say, they think that every artist likes one of his pictures best. Like maybe the first one he painted. If you take one of his pictures, and you ask a question about it-- so which one of his pictures do they expect every artist to like the best-- the answer would be, well, the first one. Even though which of his pictures is not in the right position to be bound by the quantifier every. And that becomes clearer if you make a different question. Suppose you say which of his pictures convinced the museum that every artist paints flowers? Well there's no relation between every and his there, though it's structurally, if you think it through, about the same as the one that works.
There's really only one plausible interpretation for that, and that is that the phrase which of his pictures is actually appearing twice. It's appearing where it's getting its semantic role as the object of like, and it's appearing where it is being interpreted as an operator ranging over the variable. And these examples proliferate into pretty complicated cases when you look at them closely. It turns out, interestingly, that all of this follows if you develop an optimal computational system. One that meets the condition of perfect computational efficiency. It has the fewest possible computational operations. Then it turns out that what goes to the mind ought to have all these copies. They don't get erased.
On the other hand, what goes to the mouth has them all erased, except for one, to tell you that it's there. That turns out to be optimal computation. But that has a consequence-- it means that optimal computation, the core of language design, is oriented towards the meaning. The mapping, the transformation, to what comes out of the mouth is causing communicative difficulties. Anyone who's ever worked on a parsing program, trying to figure out how to mechanically interpret the structure and meaning of a sentence, knows that one of the hardest problems is what's called filler-gap problems.
You hear the word what at the beginning of the sentence, and you've got to find where is the gap, the unpronounced element that this what is related to. If you pronounced them all, that wouldn't be a problem. Most of the cases would be solved. But efficient computation leads to the consequence that it undermines communicative efficiency. That's pretty much like the case of are eagles that flying-- are eagles that flying, swim. Nice thought, but you can't say it.
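A rough sketch of the copy-and-gap idea, for readers who want it concrete. Nothing here comes from the talk itself: the token format, the tiny lexicons `TRANSITIVE` and `OBJECT_STARTERS`, and the gap heuristic are all assumptions chosen to keep the example small.

```python
from typing import List, Optional, Tuple

# Toy lexicons (assumptions for this example only).
TRANSITIVE = {"see", "saw", "like"}
OBJECT_STARTERS = {"the", "a", "Mary", "John"}

# Internal form of "what did John see": the phrase "what" occurs twice,
# once as operator and once as object of "see". Copies share an id.
INTERNAL: List[Tuple[str, Optional[int]]] = [
    ("what", 1), ("did", None), ("John", None), ("see", None), ("what", 1),
]


def externalize(tokens: List[Tuple[str, Optional[int]]]) -> List[str]:
    """Pronounce only the first (highest) copy of each displaced phrase."""
    seen = set()
    spoken = []
    for word, copy_id in tokens:
        if copy_id is not None:
            if copy_id in seen:
                continue            # lower copy is left unpronounced
            seen.add(copy_id)
        spoken.append(word)
    return spoken


def find_gap(words: List[str]) -> Optional[int]:
    """Return the index where a missing object must be posited, if any.

    Heuristic: a transitive verb that is not followed by the start of
    an object phrase signals a gap right after it.
    """
    for i, word in enumerate(words):
        if word in TRANSITIVE:
            following = words[i + 1] if i + 1 < len(words) else None
            if following is None or following not in OBJECT_STARTERS:
                return i + 1
    return None


spoken = externalize(INTERNAL)
print(spoken)
print(find_gap(spoken))
```

Dropping the lower copy is cheap for the speaker (`externalize` is a single pass), but it pushes work onto the hearer, who has to run something like `find_gap` to recover what was erased. If every copy were pronounced, no search would be needed.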
And again, optimal operation of the operations is leading to communicative difficulties. And quite generally-- there's an interesting class of cases known now where you have conflicts between computational efficiency and communicative efficiency. And computational efficiency wins hands down in every case that's understood. Which, again, suggests that the design of language is really for meaning. It's for semantic and intentional interpretation, how to organize speech acts and so on. And that the externalization of it, the fact that it sort of appears in the outside world somehow, is just a secondary process, communication, a tertiary process. Well that reorients our conception about how the mind works, but it looks pretty plausible.
Well another issue kind of alongside-- another minute? Yeah. Going back to the '50s, it was, of course, understood right away that it's impossible for any of these things to be learned by association, conditioning, induction, any simple mechanism. So it must be the case that these examples of the kind that I mentioned-- which are known very early by children, [? although ?] [? there's very ?] little evidence-- they must result from the other two possible factors in development. One of them is some genetic endowment, which is a species property. Apparently all humans share essentially the same endowment. No group difference is known.
So that's one, and the other is just laws of nature. The kinds of laws that determine, say, that a cell divides into spheres, not cubes, no genetics involved, some, but not much. And the interaction of these two factors ought to be able to yield results of this kind. By now there's a fair amount of evidence supporting that, as the kinds of things I mentioned illustrate. Of course, languages differ all over the place. That was known too. So you have to show that these principles that you're postulating are going to apply universally.
That does not mean, contrary to a widely held misunderstanding that you can read in journals like Nature and Science, and others-- it does not mean that there are going to be descriptive generalizations about language that hold universally. In fact it might turn out there's no descriptive generalizations at all, and you have a very rich genetic structure. That's not the case, but it could turn out, because the things that you see are reflections of internal processes that are going on that meet other conditions like efficient computation. But it must be somehow you've got to apply it to everything.
Well in the '50s, in the golden age, a few languages were studied, the ones that people knew in RLE at the time. Actually, one of the first extensive studies was on an American Indian language, Hidatsa. The first dissertation in the department-- actually, it was in the EE department, because we didn't have a formal department then-- it was on Turkish. There was, of course, work on English. Morris Halle was working on Russian.
By the 1960s, as the department developed and people came in from all over, the range of languages investigated expanded enormously. In the '80s and since, it's just exploded. There's now intensive investigation of languages of a very wide typological variety. All kinds of new problems being discovered, sometimes new answers. Sometimes they lead to further problems, and so on. At the same time, what Steve mentioned before, the study of language acquisition really took off. Actually Steve's work was part of the early work on this in the '80s, based on trying to discover how the options of variation of language-- which is all that has to be picked up by a child, the rest is fixed-- how those are set on the basis of data. By now, quite a lot of work on that.
All of this-- I'll just finish by saying that plenty of problems remain. I don't want to overstate, but there are some fairly striking and reasonably plausible conclusions, which have far-reaching implications. All of this, however, is restricted to what's going on inside the internal computing system. What's going on inside the mind. Now somehow language has to be related to the outside world, and because there are two interfaces, it's going to have to be related in two different ways.
At the sensory motor interface, the question is hard, but sort of understood. That's the topic of acoustic and articulatory phonetics, which has been studied pretty intensively, orally too, for 60 years, and a lot of results, a lot of problems. But at least the problems are kind of in hand. What about the other end? How does the meaning side relate to the outside world? Well there is a doctrine that's widely held-- I'm supposed to give an answer to this. It's sometimes called the referentialist doctrine. It holds that the elementary semantic units-- let's say, words, for simplicity-- they relate to the outside world via a relation of reference, picking something out. Or denotation for predicates. Picking out a class of things. So like the word cow picks out cows, where cows are something that a physicist could identify in principle.
The only problem-- and that's very widely held. It was known in classical Greece that that's not correct. They didn't have much of a deep analysis, but examples showing it doesn't work. In the 17th century, in what we ought to call the first cognitive revolution, many of the significant questions were raised, that are still puzzling today. It was widely understood that that doesn't work, that as David Hume put it, summarizing a century of inquiry, that the things we categorize, he said, are fictitious. Created by the mind, using Gestalt properties, notions of cause and effect, notions of psychic continuity for sentient objects. John Locke studied this.
And other properties that a physicist cannot identify. They're imposed by the mind on the flux of experience. And those are the entities, if you like, sort of mental entities that we construct. And they're somehow related to the world. Now this poses a huge problem. There's no animal communication system known that has anything remotely like this. There the referentialist doctrine apparently holds.
And so if you take, say, cries of a vervet monkey, each cry is keyed to some identifiable physical event. Like leaves are moving-- we interpret that as meaning an eagle's coming. All the vervets run and hide. I don't know how they interpret it. It's another question. Or else it's an internal sort of hormonal phenomenon, like, I'm hungry, or something like that. Every communication system that's known works like that.
But human language doesn't work like that at all. You take a look at the words of the language-- none of them work like that. And that's a real problem. For one thing, it's a tremendous evolutionary problem-- where did all this come from. Totally unknown and maybe unknowable, but it's also a problem for the nature of language and for the acquisition of language. How did children find these properties which they get to understand very quickly? In fact, things like children's fairy tales exploit them effectively. Well, these are major mysteries that remain on the horizon, right at the core of what language and thought are about. And there are plenty of other ones, so I'd just suggest again that the capacity to be puzzled is a good one to cultivate.
[APPLAUSE]
PINKER: Thank you very much, Noam. Emilio Bizzi.
BIZZI: Well one of the-- can you hear me? Yep. One of the goals of artificial intelligence, as I understand it, was to build intelligent machines. And there's a class, of course, of intelligent machines that move, like robots. And so what I'm going to discuss is movement not in machines but movement from a biological perspective. And then, toward the end, I will discuss the interaction between artificial systems like robots, and biological systems that move. But for the initial part of my discussion here, I'd like to point out something that I can illustrate with an example.
Let's say that I want to reach this pair of glasses. Well this is something extremely simple. We do it hundreds of times per day. We have done it for thousands and thousands of times. So what's the big deal about it? Well this simple gesture here points out an important fact, and that is that for what seems to be a very simple action, in reality, the neural processes that subserve it, that make it possible for this action to be expressed, are of great complexity.
And this complexity is, to a certain extent, mysterious. We know a little bit about it. We know that some of the problems, some of the computations are beginning to be seen. But there are an enormous number of things that have to do with actions that, at this point, remain fairly mysterious. So let me start with what we have glimpsed recently, in the last few years. And that has to do with-- and I go back to this example. If I want to touch this pair of glasses, then I have to do a number of things simultaneously. I have to move my eyes, my eye muscles, my neck muscles, my trunk muscles. And of course, then the arm muscles.
Now from an anatomical point of view, all these muscles are made of elements which are the muscle fibers. And each group of muscle fibers receives a fiber from the central nervous system, and particularly from, in this case, from the spinal cord. So what we have here is that to implement this simple gesture, the central nervous system has to control an enormous space, an enormous number of elements. That is, the muscle fibers that make up the muscles. So this is, in a sense, a gigantic problem, because we have an extremely tough computational problem-- how to arrange this distribution of signals to a very vast space.
Well in the last few years, a number of investigators have identified elements of the architecture of the central nervous system that deal with movement that have indicated a modularity. And this modularity has been identified predominantly in the spinal cord. It's maybe also in the upper part of the brain stem, but in any case, most of it is in the spinal cord. And what it does-- it means that there are groups of interneurons that manage to put together, in a unit, a group of muscles.
What does that mean? It means that the number of degrees of freedom, this vast space, has been reduced dramatically, in such a way that the descending fibers from the cortex, which convey the information for movement from the brain down to the spinal cord-- all they have to do is to activate various modules in the spinal cord, combine them, and provide coefficients of activation for each one. And this view of how the brain manages to be so effective in producing movement is derived from experimental work that, in this lab, or in my laboratory, and in other laboratories, has been performed in the last 10 years.
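The reduction in degrees of freedom described here can be illustrated with a small numerical sketch. It is a simplification under stated assumptions, not a model from the laboratory work mentioned: the module names, the four hypothetical muscles, the weights, and the purely linear combination are all invented for the example.

```python
from typing import Dict, List

# Each hypothetical spinal module drives a fixed pattern over four muscles.
# Names and numbers are made up for illustration.
MODULES: Dict[str, List[float]] = {
    "reach": [0.8, 0.2, 0.0, 0.1],
    "lift":  [0.0, 0.5, 0.9, 0.0],
}


def muscle_activations(coefficients: Dict[str, float]) -> List[float]:
    """Combine modules linearly, one activation coefficient per module."""
    n_muscles = len(next(iter(MODULES.values())))
    total = [0.0] * n_muscles
    for name, coefficient in coefficients.items():
        for i, weight in enumerate(MODULES[name]):
            total[i] += coefficient * weight
    return total


# The descending command specifies two numbers, not four.
command = {"reach": 0.5, "lift": 2.0}
print(muscle_activations(command))
```

With two modules the descending signal only has to set two coefficients, however many muscles each module recruits. In a real limb the saving would be far larger, since the number of muscle fibers is enormous while the number of modules is assumed to be small.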
Now there are other things that are more mysterious. When I do the simple movement here, this movement is learned, has been learned somehow during the course of life. The site of learning, motor learning, is certainly the motor cortex of the brain. So the areas of the frontal lobe.
Now these areas of the frontal lobe are, of course, connected with subcortical nuclei, with the cerebellum, and so on. And certainly all these areas, in conjunction with the connectivity of the spinal cord, represent the circuitry that encodes the signals, the memory signals, that are necessary to do this simple action.
But here there is a catch. When I do these simple actions, I can do them like in front of you now, but I can do it by moving my body in a different posture. I can be reclining, and nevertheless, I can accomplish the same goal. So it's very tough to understand how the signals that have been memorized which specify particular muscles that have to be activated in their particular context can, in another context, still be just as effective. So this is a question of generalization.
And I don't understand how the central nervous system manages to do this process of generalization so effectively. There are also other computational problems that are quite difficult to see how they are implemented in the central nervous system. And that is-- I can reach this pair of glasses. If there is an obstacle, I can go up this way. I can go down this way. So this is a question of planning, how vertebrates are extremely good at planning their action, depending on their environment.
And how again the memory that guides the movements-- how can that be translated into signals modified in order to accomplish pathways that, each time, are potentially different. It depends on the environment. Now there are other things that are somewhat less mysterious. We know that among the vertebrates, a certain amount of learning goes through by imitation.
And recently, some light has been shed on this process of learning by imitation. Neurons have been found in the frontal lobe that discharge when the subject sees an action, and when the subject repeats that action, that similar action. So these are called mirror neurons, and they are an important feature of the motor system that seem to provide the basis for learning by imitation.
Now although there are all these tough computational problems that need to be understood, I am optimistic that in the next few years, we will make a lot of progress. And the reason is that there are many laboratories scattered in various parts of the world that are pursuing the issues of humanoid robotics. That is, what they're trying to do is to put these properties that I've described to you, generalization, planning, learning by imitation, and so on, in machines.
And I don't know how far this effort-- how much these people have accomplished, but nevertheless, the fact that there is this intellectual attitude toward implementing in machines-- try to find the computational solutions that provide these properties to machines is a way to really understand when you start to make things, beginning to face the computational problems and understand.
There is also-- and I'll finish in a second, soon. There is also a new field, which is somewhat promising, and that is to place sets of microelectrodes into the brain of-- it has been tried with humans, and is usually tried in animals-- and connect the output of the signals that come out of these microelectrodes and connect it with machines.
So these are brain-machine interactions, which obviously have tremendous importance, both in understanding the workings of the central nervous system and also the practical understanding, the practical significance in rehabilitation of amputees and so on.
Now, for the time being, there are technical problems that prevent the quick implementation of brain-machine interactions. And those have to do with how the brain reacts to the permanent contact with these electrodes in the brain. Essentially, they are rejected after a while.
But it's conceivable that more, different technologies could be used in order to try to get around this problem. So these are two reasons why I have a certain amount of optimism that, for the future, this motoric intelligence will be understood. Thank you.
[APPLAUSE]
PINKER: Thank you, Emilio. Barbara Partee.
PARTEE: For me, the adventure began just 50 years ago. Here at MIT, in 1961. The Chomskian revolution had just begun, and Noam Chomsky and Morris Halle had just opened up a PhD program in linguistics, and I came in the first class. I want to start by thanking Chomsky and Halle for building that program. And I thank MIT and the Research Laboratory of Electronics for supporting it. I'm indebted to Chomsky for revolutionizing the field of linguistics and making it into a field whose excitement has never waned. Chomsky redefined linguistics as the study of human linguistic competence, making linguistics one of the early pillars of cognitive science.
At the center of the Chomskian revolution was syntax, generative grammar, a finite specification of the infinite set of sentences of a language. That launched an extremely productive research program, but it didn't include semantics. Chomsky considered meaning scientifically intractable, and he wasn't alone. When linguists did soon try to add semantics to generative grammar, there were two big obstacles-- a lack of good formal tools, and the absence of any non-subjective notion of meaning.
It was the UCLA logician Richard Montague who revolutionized the study of semantics, in the late 1960s. He built on successes in formalizing the syntax and semantics of logical languages. The work of Frege, Tarski, and others. Montague himself developed a higher order typed intensional logic with a model theory. And with its help, he was able to formalize a significant part of the syntax and semantics of English within a clearly articulated theoretical framework. The result was what we now call formal semantics.
Emmon Bach has summed up Chomsky's and Montague's cumulative innovations thus-- Chomsky showed that English can be described as a formal system. Montague showed that English can be described as an interpreted formal system. For me, as a young syntactician at UCLA in the late '60s, Montague's work was eye-opening. Linguists had been building tree-like semantic representations. OK for capturing certain things like scope ambiguity, but otherwise leading to endless arguments about competing representations.
We linguists never dreamed of truth conditions being relevant for linguistics, but Montague's work and David Lewis's showed that truth conditions are crucial for giving semantics a non-subjective foundation. A linguistically exciting part was that with such a rich logic, we could get a close match between syntactic constituents and semantic constituents, and meanings of sentences could be built up recursively and compositionally from the meanings of their syntactic parts.
One brief illustration-- Bertrand Russell had argued that natural language syntax is logically very misleading, since it puts John and every man into the same syntactic category of noun phrases. Whereas in first order logic, John walks, and every man walks, must have radically different syntactic structures. Montague showed that natural language syntax does not have to be logically misleading at all.
All noun phrases can be treated uniformly in higher order logic as so-called generalized quantifiers. In English subject-predicate sentences like John walks and every man walks can then be interpreted in a similar manner straightforwardly. Generalized quantifier theory is now one of many rich topics in formal semantics, and it can explain many things that we couldn't explain with just syntax.
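[ILLUSTRATION] The uniform treatment of noun phrases described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Montague's own fragment: predicates are modelled as finite sets of entities, and the names `proper_name`, `every`, `some`, and `sentence` are hypothetical helpers introduced only for this sketch.

```python
from typing import Callable, FrozenSet

Entity = str
# Assumption: a one-place predicate is modelled as the set of entities it holds of.
Predicate = FrozenSet[Entity]
# A generalized quantifier maps a predicate to a truth value.
GeneralizedQuantifier = Callable[[Predicate], bool]


def proper_name(entity: Entity) -> GeneralizedQuantifier:
    """'John' denotes the function that is true of every predicate holding of John."""
    return lambda predicate: entity in predicate


def every(restrictor: Predicate) -> GeneralizedQuantifier:
    """'every man' is true of a predicate when all men are in it."""
    return lambda predicate: restrictor <= predicate


def some(restrictor: Predicate) -> GeneralizedQuantifier:
    """'some man' is true of a predicate when at least one man is in it."""
    return lambda predicate: bool(restrictor & predicate)


def sentence(noun_phrase: GeneralizedQuantifier, verb_phrase: Predicate) -> bool:
    """Subject-predicate composition: apply the noun phrase to the verb phrase."""
    return noun_phrase(verb_phrase)


if __name__ == "__main__":
    # A hypothetical three-entity model, chosen only for illustration.
    man = frozenset({"John", "Bill"})
    walk = frozenset({"John", "Bill", "Mary"})
    print(sentence(proper_name("John"), walk))  # John walks
    print(sentence(every(man), walk))           # every man walks
```

In this sketch both subjects have the same type and combine with the verb phrase through the same rule, which is the point of the contrast with first order logic.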
I started working on Montague grammar in about 1970, to try to integrate it into linguistics by finding ways to put Chomsky's and Montague's work together. From the early '70s, collaborative work among philosophers, logicians, and linguists in the US and Europe built in many directions. One important result is that with a serious semantics to share the load, syntax doesn't have to do all of the work that we once imagined. Syntactic and semantic advances now often develop in tandem, each informing the other.
By the late 1980s, formal semantics was well-established within linguistics departments in the US. In Europe, it may be in linguistics or in philosophy. Among middle generation leaders of formal semantics, I'll just mention two who are involved in this symposium, and who were my PhD students at U Mass in the 1980s. Irene Heim, head of the Department of Linguistics and Philosophy here, became MIT's first formal semantics faculty member in 1989. And Gennaro Chierchia is now head of linguistics at Harvard. Both have made seminal contributions that have helped shape the field.
I have to mention one seeming paradox which relates to open challenges as well as to past progress. I stressed how the Chomskian revolution made linguistics an early pillar of cognitive science. Yet Frege's anti-psychologistic stance, shared by Montague, was crucial in the foundations of formal semantics. Frege argued that truth conditions and not mental ideas have to be at the core of the meaning of a sentence.
And the work of generations of linguists and philosophers has shown the fruitfulness of that approach. First we have to formalize what the semantics of our language is; what our sentences say about how the world is. Then figure out how our knowledge of it, such as it is, is acquired and exploited.
This stance might seem to exclude formal semantics from cognitive science, but I believe on the contrary, it makes the contributions of semantics to cognitive science all the more interesting. Human language is a remarkable achievement. Part of what's remarkable is how we implicitly recognize and navigate the social construction of meaning. When we converse, we simultaneously exchange information and negotiate meaning.
David Lewis's classic work on convention was an early milestone in exploring the relation between individual competence, what's in the head of one language user, and social intelligence, an important dimension that is still probably under-explored. As for newer directions, the best understood parts of formal semantics concern the semantic compositions of parts from wholes. Sorry, of wholes from parts, of course.
The semantics-- I proofread this several times-- what we might call the semantics of syntax. There has also been great progress on formalizing parts of pragmatics, involving the interaction of meaning, language use, and context, including the study of how context both affects and is affected by interpretation.
And studies in language processing now include formal semantics and pragmatics, and game theoretic approaches are having a growing influence. Computational formal semantics is now a subfield of computational linguistics, contributing to both theoretical and applied goals. And there's promising early work on universals and typology in semantics with innovative fieldwork, uncovering semantic and pragmatic subtleties of various indigenous languages. And just as in syntax, such languages prove to be as semantically rich and complex as more familiar European languages.
There are many challenges facing formal semantics and its relatives. These are still young fields. I'll mention just two that I think are important both for the field and for the goals of this symposium. First, the semantics of sentential structures is increasingly well-understood, but lexical semantics is really hard. And formal methods are weakest here. It's in lexical semantics that we most clearly cannot equate the language with our knowledge of it.
As Hilary Putnam observed years ago, the fact that he doesn't himself know anything that would distinguish a beech tree from an elm tree does not mean that beech and elm are synonyms in his language. The lexicon has lots of linguistically important substructure, but it's also the part of language that interfaces most strongly with non-linguistic knowledge, with encyclopedic knowledge, with our common sense ontology, with visual imagery, with cultural associations, and on, and on.
And connected with that, another challenge is how to build formal semantics into real-time processing models, whether we think of psychological models of how we do it, or computational models that might do it in an applied way. Models that involve the integration of linguistic and not specifically linguistic knowledge.
This concerns not only lexical semantics, but also all kinds of context dependence, implicit content, meaning shifts, and non-literal uses of language. I know there is progress. I'm not an expert in those areas, but I believe this is a domain where major interdisciplinary innovation and cooperation may be crucial.
Just an illustration of the challenges. I'm pretty sure that the most successful search engines do not do any formal semantic analysis of questions that we ask or of texts that they are searching. I've heard of some beginnings of some nice work in that direction, but large-scale, all-purpose, fast search engines or "intelligent machines" that really know and use semantics as we study it are probably still far in the future.
In conclusion, I would suppose that really knowing semantics is a prerequisite for any use of language to be called intelligent. So if there is to be a new basic research effort to understand intelligence and the brain, semantics is ready to contribute to it and to gain from it. Thank you.
[APPLAUSE]
PINKER: Thank you Barbara, and Patrick Winston I think you're going to be a motile organism, as I recall. Or would you like to use the podium?
WINSTON: I think I'll not use that. I've never been very good at introductory statements, so I haven't prepared one. I'll ask myself a few questions instead. And my first question is, well, what do you think so far? And that's a difficult question for me to answer, because several of the speakers-- you may have noticed-- ran out of time.
I was particularly disappointed that Marvin was unable to give us a detailed account of his assessment of the progress that's been made in artificial intelligence in the past 20 years. So I've decided to dedicate a little of my time to doing a simulation of Marvin, and here it is. And that concludes my simulation of Marvin's account of the past 20 years.
Now it's certainly the case that many people would contest the view that there's been no progress, but I don't think anyone would contest the view that there could have been more progress in the past 20 years than there has been. And I think it's informative to think about why that might be, because we have no lamp to guide us into the future except the lamp of what has passed before.
And I think that, in my view, what went wrong went wrong in the '80s. And the first thing that went wrong is that we all discovered artificial intelligence technology was actually useful. And that led to a kind of commercial distraction, from which the field has never recovered. So that's one problem.
Another problem, of course, is that that happened to be the decade in which the Cold War ended, and as a consequence of that, there were shifts in funding patterns that contributed to the problem, because sponsors were no longer as interested in being princes of science, but rather became much more interested in short-term goals, demonstrations and the like.
But I think the most insidious problem was what I'd like to label the mechanistic Balkanization of the field. What happened in that period was that artificial intelligence as a field matured and began to break up into communities that were coagulating around mechanisms rather than problems.
So it was in the '80s that we began to see conferences dedicated to neural nets. Conferences dedicated to genetic algorithms. Conferences dedicated to probabilistic inference. And when you dedicate your conferences to subjects of that kind, to mechanisms, then there's a tendency to not work on fundamental problems, but rather those problems that the mechanisms can actually deal with.
This tended to affect not only the conferences that were being held but also the kind of jobs that were offered by universities. Those job descriptions tended to have phrases in them like neural nets, probabilistic inference, genetic algorithms, and the like.
I think it was in about 1995 that I had a horrible nightmare as a consequence of this. I was on the search committee for the electrical engineering and computer science department at the time. And in my nightmare, we had put out a call for people interested in making an application to work at MIT as a professor in the electrical engineering and computer science department, and Jesus applied.
[LAUGHTER]
WINSTON: And after due consideration of the application, we decided to send the standard letter. Thank you for your interest in our department. We regret that with over 400 applicants, we cannot interview anyone, and besides, this year, we're not looking for a theologian.
It was because of that kind of Balkanization, which focused everyone on particular, rather narrow elements of the field. So what to do? Some people say, well, Winston, the trouble with you is you're just a hopeless romantic about the past, but that's not entirely true, because I'm also romantic about the future. It's just right now that may not be so hot.
And I think the future does offer some hope, and you might say, why. And I think the reason is because we now are gradually becoming less stupid about what to do. So you might say, well, what is it that you think we should do? And then my response would be, well, I think we should take the right approach. And then you would say what is the right approach. What do you call your approach? And I would reply, well, I call my approach the right way. And then you might say, well, what does your approach have to offer.
And then I would say that what my approach has to offer is a focus on the fundamental differences between our human species and other primates, and, in particular, others of the genus Homo. And you might say, well, what are those properties that separate us from, for instance, the Neanderthals? And who knows, but if we want a clue, the best thing to do is to ask people who study the question.
So I've been much influenced by, for example, the paleoanthropologist Ian Tattersall, who tells us that we as a species co-existed quite happily with the Neanderthals for about 100,000 years, and during that time we made approximately the same kind of stone tools, and during that time, there was approximately no progress in our making of stone tools.
But then, about 60,000 years ago or so, something magical happened to the human species. It probably happened in southern Africa, and it probably happened in a necked-down population of maybe many hundred, but certainly not more than a few thousand individuals. And once we got that new property, it spread throughout the human population.
And within 10,000 years, we had either wiped out or outcompeted the Neanderthals. Wiped out a lot of species along our path throughout the world, and never looked back. And so what is it then that was that magic property? Once again, who knows, but we can speculate. And I've heard Noam say things like it was the capacity, perhaps, to take two concepts and put them together to make a third concept, and to do that without limit, and without disturbing the concepts that got merged together.
I think that's probably right, and I would hazard to layer a piece on top of that, because I think that that capability in turn makes it possible for us to describe events. And once we can describe events, it's natural to develop a capacity to string them together into stories. And once we have stories, then we can take two stories and blend them together to make another story. And once we have that, we have at least a kind of creativity. And once we have that, we have a species that's much different from any other.
So I've come to believe that story understanding is one of the great things that's enabled by the inner language that's underneath our communication language. And I think that the past 50,000 years has been a period of time in which we have become increasingly awash in stories. We're soaked in stories from our birth through our death. We start with fairy tales. We continue on through religious parables. We go on to literature and history. We end up, perhaps, with law school or medical school, and throughout that whole period of education, we're essentially accumulating and learning to manipulate the stories that come to us in our culture and in our environment.
So that's what I think ought to be done. And now you might ask a natural question-- what kind of property do you need in order to study the particular questions that derive from this approach? And the answer is that there's only one property that you really need, and that is, I think, naive optimism in the face of overwhelming contraindication.
But that's OK. I don't mind that, because, at the very least, I think that through the study of stories, we will develop at least sharper questions to ask in the next generation. I also think that it's possible to do the kinds of things that I do better today than I was able to do them some years ago, because I think I've gradually become less stupid about the steps that need to be taken in order to make progress.
So here are the steps that I think you need to take in order to make progress. First of all, you need to characterize the behavior. Next, after you've characterized the behavior, you need to formulate a computational problem. And once you've formulated a computational problem, then you can take step three, which is to propose a computational solution. And all of these things are the natural objectives of classical psychology.
But then there's a fourth step. Once you have the proposed solution to the computational problem, then the kind of thing that I do I could refer to as exploratory development-- building systems that you believe will have the behavior that you have characterized in virtue of the computational solution that you have proposed. And when you do that, you discover, to your horror and amazement, that you left things out. That your solution is incomplete, and you have to go back through that cycle again, and again, and again.
So I think that's four steps. The computational problem, or rather the behavior of the computational problem with all the constraints necessary to solve it. A proposed solution, exploratory development, and as one of my students, [? Leona ?] [? Federev ?], mentioned to me this morning, that isn't enough either, because at step four, you might be in the same situation that the Romans were when they had catapults. You can improve them, but you didn't understand them until you had a notion of Newtonian mechanics.
So the fifth step then is you have to look back over the first four steps and crystallize out the principles that made everything possible. So that's what, I think, ought to be done. And that's the sort of thing that I do. But as I say that, I think it would be easy to suppose that I think everything happens in symbol land, merely because we are a symbolic species.
But I don't think everything happens in symbol land because I think we make use of the stuff that was there before. In particular, we have to solve a lot of problems with our perceptual apparatus. So I could say, for example, to you Sebastian kissed Aaron. And then I could ask did Sebastian touch Aaron. And you would immediately reply, yes, of course, because you can imagine it. And you don't feel like you're doing syllogistic reasoning. You feel like you're imagining the situation and reading the answer off of the imagined situation with your perceptual apparatus.
And I think that's right. I think we know a lot of things latently, and we don't have to be told them. We don't have to be told that if you kiss somebody, you touch them, because we can figure that out on the fly. And then of course we can cache it in our symbolic systems if we want, but the point is that we can compute that stuff when we need it. So we know vastly more than what we actually know.
Here's another example that I am very fond of. I borrowed it from Shimon Ullman. It's the example of well-- what am I doing? It's not a trick question. I'm drinking. Now what am I doing? I'm toasting. Now what am I doing? I'm admiring this fine Davide Fuin glass.
But then for the next example, I'd have to ask you to exercise your own power of imagination, because I don't have a slide that does it. What I want you to imagine is that I'm a cat, and my mouth is pointed to the sky. And up there, above my mouth, is a leaking faucet. What am I doing? I'm drinking, even though that kind of drinking doesn't look visually anything like the kind of drinking I did with my first example. Yet somehow we label those both as drinking.
So my view is that it must be the case that this kind of drinking and that kind of drinking is telling the same story. That there's a story about thirst. That there's a story about emotion. There's a story about liquid and movement through space into your mouth, and that's the story of drinking, and you can recognize it from a single frame as a drinking story, even though it's very different from other frames of other videos you've seen that tell the drinking story.
So that's what I think, and the last question I'll ask myself is what I've tried to contribute here today. And the answer is I've tried to talk a little bit about why I think we haven't made progress in the field to the extent that we should have. What I think we ought to do now in order to make progress, and the steps that I think need to be taken by people who are trying to make that progress, and an emphasis too on the idea that we're not just a symbolic species-- our symbolic nature and our inner language capability make possible an interaction with our perceptual apparatus that I think would not be possible in a dog or a chimpanzee.
And finally, I think as part of the contribution package, I'd like to say that I think that if we do all this, then we'll have a much better understanding of how we work, and I think that will lead to what I would like to call, with perhaps a touch of irony, unthinkable value to the species.
[APPLAUSE]
PINKER: Thank you Patrick. Tommy Poggio tells me that we have another 15 minutes, and I would hate to let this opportunity pass with this extraordinary collection of speakers without some exploration of perhaps some themes that crosscut some of the presentations. I'm going to take the liberty of asking a couple of questions myself to try to draw out some connections among the fascinating presentations we just heard.
And I'll start with Sydney Brenner. You have inaugurated the study of the nervous system of a creature with 302 neurons. Many of us are interested in the neuroscience of a creature with 100,000,000,302 neurons, give or take a few thousand. And we have heard, this very evening, suggestions that the genetic code contains information that is exploited in the development of the capacity, say, to use human language, together with the thoughts that are expressed in language.
You referred jocularly to the Chomsky gene, the single hypothetical gene that turns us into a human. But more seriously, do you have any thoughts on how we might eventually narrow the gap between systems like C. elegans, where we clearly have an enormous amount of understanding, and the organism that many of the rest of us are interested in and the possible information in the genome that results in the innate unfolding of cognitive abilities?
BRENNER: Well I think it has to go beyond the hard wired systems. And I think there is a biological invention that occurred which allowed us to expand the possibilities of the system. If I can give one example of this, I think that a thing that would be extremely interesting to ask is, is it just going to be more of the same stuff that somehow culminates in intelligence and all the other properties associated, or is there some revolutionary change in principle?
So I think the important thing is that once we get away and have a way of building bigger nervous systems, we have the potential then to have not fixed it. So there are many such systems in biology-- the immune system says you can learn things from the past, which is by animals being selected because of things. But you must have a way of meeting unforeseen circumstances.
And I think we have to have something like that. It needn't be coded like the immune system. We have to have a flexible system in which we can achieve the result but we need additional input of experience, if you like. So perhaps our genomes only decide how to make a baby. And all of the rest of it has to follow. But they give you the capacity to do these things. And that, I think, is a very interesting thing.
On the other hand, I am still very interested in the accumulation that comes from the processes of evolution. And evolution, of course, naturally selects for organisms. And so to speak, guarantees their survival. In fact, it could be argued that the trouble with us as human beings is that we stopped biological evolution. Where sitting inside of ourselves is a quarter of a million year old gene totally unsuited, maladapted to the environment that we've created as part of history.
And I think we should be clear about that, because it's still, if you like-- deep down inside, there is an animal. And the animal is still very much like a rat or a mouse. Of course, it doesn't do the same things. But it is designed. A system designed for function, for survival.
I think what we've been able to do by the expansion of other parts of the brain, is, in fact, in many cases, to be able to suppress that. I think it's a very good example; you can see this very clearly, and that's why I think the basic part of the brain that we can still explain from the genome and all of these other things resides in the hypothalamus. The hypothalamus is Freud's id. All right?
Now what happens is the frontal cortex sends inhibitory stimulus down to the hypothalamus and says stop behaving like a beast. So all those things which we regret in our constitution like greed, lust, desire, avarice, those are encoded, really in the genes that affect the hypothalamus. What we have is this inhibitory thing. So that means that we can suppress this. And sometimes we don't want to, or can't, so what.
So I think that the layering of the system which distinguishes that, and therefore you will have to work in this field here that looks at this. Everybody knows that if you take a rat and let it smell a cat-- it has to smell it-- that's encoded in the genome, and it undergoes complete panic.
Of course, you can condition that to the vision of a cat. Then you show the animal a cat, complete panic. You take it out, and you just destroy 3,000 cells in its hypothalamus. And you put it back in the cage. You can show it the cat. It'll saunter out of this, walk in front of the cat.
The amusing thing-- now the cat is perplexed. However, I'll just give you that. And I think a lot of this is, I think, in this. And I think this is where biology can tell you what the ground processes-- what's encoded in our genes. Many things are like that.
But I think the rest of it is all the other stuff that we've seen. So we have, if you like, a machine within a machine. We have added on top of this a machine which is capable of flexibility, which is capable of changing it. So that's why I don't believe that tracing out all the connections-- I think that's a form of insanity-- and then we'll be able to discover exactly this. I think that is a form of insanity.
I mean today nobody can describe to you in detail all the physical events that occur in a computer chip, when you take a picture in your camera which has the telephone attachment to it. And nobody wants to describe it anymore, because the basic principles are there-- we've got above it.
So I think we have to look at those, still with this old thing of layers of organizations. And we must learn to know what we can do, and what others can do together with us. That's why it's got to be multiple-disciplinary.
PINKER: Thank you. I have a question for both Noam and Marvin that I think is on the minds of many people in this room, and I know it's been expressed in some of the questions that I've received by e-mail prior to this event-- that there is a narrative in which the new direction of both artificial intelligence and cognitive science is one that makes a great deal more use of probabilistic information that is gleaned from enormous amounts of experience during learning.
This is manifested in branches of cognitive science such as neural networks and connectionism, Bayesian inference models, application of machine learning to intelligence, many of the models both of Tommy Poggio and Josh Tenenbaum.
In the classic work from the golden age, and indeed, in many of the models since then, including the models of generative grammar and models of semantic memory, probabilities don't play a big role. Is the narrative that says that the direction of the field is in making use of massive amounts of statistical information via learning-- well maybe I'll ask you to complete that sentence, what is the-- I'll let you complete the sentence. Noam?
CHOMSKY: It's true. There's been a lot of work on trying to apply statistical models to various linguistic problems. I think there's been some successes, but a lot of failures. The successes, to my knowledge at least-- there's no time to go into details, but the successes that I know of are those that integrate statistical analysis with some universal grammar properties, some fundamental properties of language. When they're integrated, you sometimes do get results. The simplest case maybe is the problem of identifying words in a running discourse.
It's something apparently children do. They hear a running discourse; they pick out units. And the obvious units are words, which are really phonological units. And there's a natural property-- I wrote about it in the 1950s, I mean it was just taken for granted-- that if you just take a look at the-- if you have a series of sounds, and you look at the transitional probabilities at each point, what's likely to come next.
When you get to a word boundary, the probabilities go way down. You don't know what's coming next. If it's internal to a word, you can predict pretty well. So if you kind of trace transitional probabilities, you ought to get something like word boundaries. I wrote about it in 1955, and I assumed that that's correct. Turns out to be false. It was shown, actually by Charles Yang, a former student here. A PhD in the computer science department.
If you apply that method, it basically gives syllables, in a language like English. On the other hand, if you apply that method under a constraint, namely the constraint that a word has what's called a prosodic peak, like a pitch stress peak, which is true, then you get much better results. Now there's more recent work, which is still in press, by Shukla and Aslin and others, which shows that you get still better results if you apply that together with what are called prosodic phrases.
It turns out that a sentence-- let's say it has units. Units of pitch, and stress, and so on, which are connected, related to the syntactic structure. Actually, in ways which were studied maybe first seriously by another former student Lisa Selkirk, a colleague of Barbara's.
But when you connect-- when you interact prosodic peaks with transitional probabilities, then you get a pretty good identification of word boundaries. Well that's the kind of work that I think makes sense. And there are more complex examples, but it's a simple example of the kind of thing that can work.
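[ILLUSTRATION] The bare transitional-probability idea described above can be sketched as follows. This is a minimal toy under stated assumptions, not Charles Yang's procedure and not the prosodically constrained method: the input is assumed to be already divided into syllable-like units, a boundary is posited wherever the estimated probability of the next unit falls below a fixed cutoff, and the function names and the `threshold` parameter are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def transitional_probabilities(stream: Sequence[str]) -> Dict[Tuple[str, str], float]:
    """Estimate P(next unit | current unit) from adjacent pairs in the stream."""
    pair_counts = Counter(zip(stream, stream[1:]))
    # Count each unit only where it has a successor (the final unit has none).
    first_counts = Counter(stream[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


def segment(stream: Sequence[str], threshold: float) -> List[str]:
    """Posit a word boundary wherever the transitional probability dips below threshold."""
    if not stream:
        return []
    tp = transitional_probabilities(stream)
    words: List[str] = []
    current = [stream[0]]
    for a, b in zip(stream, stream[1:]):
        if tp[(a, b)] < threshold:
            # Low predictability of b given a: treat this point as a boundary.
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words


if __name__ == "__main__":
    # An invented three-word artificial language, used only for illustration.
    a, b, c = ["bi", "da", "ku"], ["pa", "do", "ti"], ["go", "la", "bu"]
    stream = a + b + c + a + c + b + a
    print(segment(stream, threshold=0.75))
```

On an artificial stream like this one, where every word has three syllables and word order varies, the dips line up with word boundaries. The remarks above concern what happens on real English input, where the unconstrained statistic is reported to do much worse.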
On the other hand, there's a lot of work which tries to do sophisticated statistical analysis. Bayesian, and so on, and so forth, without any concern for the actual structure of language. As far as I'm aware, that only achieves success in a very odd sense of success. There is this notion of success which has developed in computational cognitive science in recent years, which I think is novel in the history of science. It interprets success as approximating unanalyzed data.
So for example, if you were to, say, study bee communication this way, instead of doing the complex experiments that bee scientists do, like having bees fly to an island to see if they leave an odor trail and this sort of thing-- if you simply did extensive videotaping of bees swarming, and you did a lot of statistical analysis of it, you would get a pretty good prediction for what bees are likely to do next time they swarm.
Actually, you'd get a better prediction than bee scientists do, and they wouldn't care, because they're not trying to do that. And you can make it a better and better approximation by more videotapes, and more statistics, and so on. Actually, you could do physics this way. Instead of studying things like balls rolling down frictionless planes, which can't happen in nature-- if you took a ton of video tapes of what's happening outside my office window, let's say-- leaves flying and various things-- and you did an extensive analysis of them, you would get some kind of prediction of what's likely to happen next. Certainly way better than anybody in the physics department could give.
Well that's a notion of success, which is, I think, novel. I don't know of anything like it in the history of science. And in those terms, you get some kind of successes. And if you look at the literature in the field, a lot of these papers are listed as successes. And when you look at them carefully, their success is in this particular sense, and not the sense that science has ever been interested in. But it does give you ways of approximating unanalyzed data. Corpus analysis, and so on, and so forth. I don't know of any other cases, frankly.
So there are successes where things are integrated with some of the properties of language, but I know of none in which they're not.
PINKER: Thank you. To my tremendous regret, and I imagine of many people in the audience, I'm going to have to bring the proceedings to a close, and we'll have to be left with Patrick's interpretation of what Marvin's response would have been to the question.
I'd like to thank Sydney Brenner, Marvin Minsky, Noam Chomsky, Emilio Bizzi, Barbara Partee, Patrick Winston for an enormously stimulating session. And thanks to all of you.
[APPLAUSE]