Lex Fridman Podcast

Conversations about science, technology, history, philosophy and the nature of intelligence, consciousness, love, and power. Lex is an AI researcher at MIT and beyond.


The following is a conversation with Jay McClelland,
a cognitive scientist at Stanford
and one of the seminal figures
in the history of artificial intelligence
and specifically neural networks.
Having written the parallel distributed processing book
with David Rumelhart,
who co-authored the back propagation paper
with Jeff Hinton.
In their collaborations, they've paved the way
for many of the ideas
at the center of the neural network-based
machine learning revolution of the past 15 years.
To support this podcast,
please check out our sponsors in the description.
This is the Lex Fridman Podcast
and here is my conversation with Jay McClelland.
You are one of the seminal figures
in the history of neural networks.
At the intersection of cognitive psychology
and computer science,
what to you has, over the decades, emerged
as the most beautiful aspect of neural networks,
both artificial and biological?
The fundamental thing I think about
with neural networks is how they allow us to link
biology with the mysteries of thought.
And when I was first entering the field,
myself in the late 60s, early 70s,
cognitive psychology had just become a field.
There was a book published in 67 called Cognitive Psychology.
And the author said that the study of the nervous system
was only of peripheral interest.
It wasn't gonna tell us anything about the mind.
And I didn't agree with that.
I always felt, oh, look, I'm a physical being.
I, from dust to dust, ashes to ashes
and somehow I emerged from that.
So that's really interesting.
So there was a sense with cognitive psychology
that in understanding the sort of neuronal structure
of things, you're not going to be able to understand
the mind and then your sense is if we study
these neural networks, we might be able to get
at least very close to understanding
the fundamentals of the human mind.
Yeah.
I used to think, or I used to talk about the idea
of awakening from the Cartesian dream.
So Descartes thought about these things, right?
He was walking in the gardens of Versailles one day
and he stepped on a stone and a statue moved.
And he walked a little further, he stepped on another stone
and another statue moved.
And he was like, why did the statue move
when I stepped on the stone?
And he went and talked to the gardeners
and he found out that they had a hydraulic system
that allowed the physical contact with the stone
to cause water to flow in various directions,
which caused water to flow into the statue
and move the statue.
And he used this as the beginnings of a theory
about how animals act.
And he had this notion that these little fibers
that people had identified that weren't carrying the blood
were these little hydraulic tubes
that if you touch something that would be pressure
and it would send a signal of pressure to the other parts
of the system and that would cause action.
So he had a mechanistic theory of animal behavior.
And he thought that the human had this animal body,
but that some divine something else
had to have come down and been placed in him
to give him the ability to think, right?
So the physical world includes the body in action
but it doesn't include thought according to Descartes, right?
And so the study of physiology at that time
was the study of sensory systems and motor systems
and things that you could directly measure
when you stimulated neurons and stuff like that.
And the study of cognition was something that was tied in
with abstract computer algorithms and things like that.
But when I was an undergraduate,
I learned about the physiological mechanisms.
And so when I'm studying cognitive psychology
as a first year PhD student, I'm saying, wait a minute,
the whole thing is biological, right?
You had that intuition right away.
That seemed obvious to you.
Yeah, yeah.
Isn't that magical though that from just
the little bit of biology can emerge the full beauty
of the human experience?
Is it, well, why is that so obvious to you?
Well, obvious and not obvious at the same time.
And I think about Darwin in this context too
because Darwin knew very early on
that none of the ideas that anybody had ever offered
gave him a sense of understanding
how evolution could have worked, but he wanted
to figure out how it could have worked.
That was his goal.
And he spent a lot of time working on this idea
and coming, you know, reading about things
that gave him hints and thinking they were interesting
but not knowing why and drawing more and more pictures
of different birds that differ slightly from each other
and so on, you know, and then he figured it out.
But after he figured it out, he had nightmares about it.
He would dream about the complexity of the eye
and the arguments that people had given
about how ridiculous it was to imagine
that that could have ever emerged from some sort of,
you know, unguided process, right?
That it hadn't been the product of design.
And so he didn't publish for a long time,
in part because he was scared of his own ideas.
He didn't think they could possibly be true.
But then, you know, by the time
the 20th century rolls around, we all,
you know, we understand that, or many people understand
or believe that evolution produced, you know, the entire
range of animals that there are.
And, you know, Descartes' idea starts to seem
a little wonky after a while, right?
Like, well, wait a minute.
There's the apes and the chimpanzees and the bonobos
and, you know, like, they're pretty smart in some ways,
you know, so what, oh, you know, somebody comes,
oh, there's a certain part of the brain
that's still different, they don't, you know,
there's no hippocampus in the monkey brain,
it's only in the human brain.
Huxley had to do a surgery in front of many, many people
in the late 19th century to show to them
there's actually a hippocampus in the chimpanzees' brain,
you know?
So the continuity of the species
is another element that, you know, contributes
to this sort of, you know, idea
that we are ourselves a total product of nature.
And that, to me, is the magic and the mystery how.
How nature could actually, you know,
give rise to organisms that have the capabilities
that we have.
So it's interesting because even the idea of evolution
is hard for me to keep all together in my mind.
So because we think of a human timescale,
it's hard to imagine that like the development
of the human eye would give me nightmares too,
because you have to think across many, many, many
generations, and it's very tempting to think about
kind of a growth of a complicated object,
and it's like, how is it possible for such a thing
to be built?
Because also me, from a robotics engineering perspective,
it's very hard to build these systems.
How can, through an undirected process,
can a complex thing be designed?
It seems not, it seems wrong.
Yeah, so that's absolutely right.
And I, you know, a slightly different career path
that would have been equally interesting to me
would have been to actually study the process
of embryological development flowing on
into brain development and the exquisite sort of
laying down of pathways and so on that occurs in the brain.
And I know the slightest bit about that, it's not my field,
but there are, you know, fascinating aspects
to this process that eventually result in the complexity
of various brains, at least, you know, one thing we're,
in the field, I think people have felt for a long time,
in the study of vision, the continuity between humans
and non-human animals has been second nature
for a lot longer.
I was having, I had this conversation with somebody
who's a vision scientist and, you know, saying,
oh, we don't have any problem with this, you know,
the monkey's visual system and the human visual system,
extremely similar, up to certain levels, of course,
they diverge after a while.
But the first, the visual pathway from the eye to the brain
and the first few layers of cortex or cortical areas,
I guess, one would say, are extremely similar.
Yeah, so on the cognition side is where the leap
seems to happen with humans,
that it does seem to be kind of special.
And that's a really interesting question
when thinking about alien life
or if there's other intelligent alien civilizations out there,
it's how special is this leap?
So one special thing seems to be the origin of life itself,
however you define that, there's a gray area.
And the other leap, this is very biased perspective
of a human is the origin of intelligence.
And again, from an engineer perspective,
it's a difficult question to ask, an important one,
is how difficult is that leap?
How special were humans?
Did a monolith come down?
Did aliens bring down a monolith?
And some apes had to touch a monolith,
but to get it to get-
It's a lot like Descartes' idea, right?
Exactly, but it just seems one heck of a leap
to get to this level of intelligence.
Yeah, and so Chomsky argued that some genetic fluke occurred,
100,000 years ago, and just happened that some human,
some hominin predecessor of current humans
had this one genetic tweak that resulted in language.
And language, then,
was the only way to get to that level of intelligence.
Language, and language then provided
this special thing that separates us from all other animals.
I think there's a lot of truth
to the value and importance of language,
but I think it comes along with the evolution
of a lot of other related things related to sociality
and mutual engagement with others
and establishment of, I don't know,
rich mechanisms for organizing an understanding of the world
which language then plugs into.
Right, so language is a tool that allows you
to do this kind of collective intelligence,
and whatever is at the core of the thing
that allows for this collective intelligence
is the main thing.
And it's interesting to think about
that one fluke, one mutation,
could lead to the first crack opening of the door
to human intelligence.
Like all it takes is one.
Like evolution just kind of opens the door a little bit,
and then time and selection takes care of the rest.
You know, there's so many fascinating aspects
to these kinds of things.
So we think of evolution as continuous, right?
We think, oh yes, okay, over 500 million years
there could have been this relatively continuous changes.
And but that's not what anthropologists,
evolutionary biologists found from the fossil record.
They found hundreds of millions of years of stasis.
And then suddenly a change occurs.
Well, suddenly on that scale is a million years or something,
but or even 10 million years,
but the concept of punctuated equilibrium
was a very important concept in evolutionary biology.
And that also feels somehow right about,
you know, the stages of our mental abilities.
We seem to have a certain kind of mindset at a certain age.
And then at another age, we like look at that four-year-old
and say, oh my God, how could they have thought that way?
So Piaget was known for this kind of stage theory
of child development, right?
And you look at it closely
and suddenly those stages aren't so discrete, and the transitions aren't so sharp.
But the difference between the four-year-old
and the seven-year-old is profound.
And that's another thing that's always interested me
is how we, something happens over the course
of several years of experience
where at some point we reach the point
where something like an insight or a transition
or a new stage of development occurs.
And, you know, these kinds of things can be understood
in complex systems research.
And so evolutionary biology, developmental biology,
cognitive development are all things
that have been approached in this kind of way.
Yeah, just like you said,
I find both fascinating those early years of human life
but also the early like minutes, days
of the embryonic development to like how from embryos
you get like the brain, that development.
Again, from the engineer perspective, it's fascinating.
So it's not, so the early, when you deploy the brain
to the human world and it gets to explore that world
and learn, that's fascinating.
But just like the assembly of the mechanism
that is capable of learning, that's like amazing.
The stuff they're doing with like brain organoids
where you can build many brains and study that
self-assembly of a mechanism from like the DNA material.
That's like, what the heck?
You have literally like biological programs
that just generate a system, this mushy thing
that's able to be robust and learn
in a very unpredictable world
and learn seemingly arbitrary things
or like a very large number of things
that enable survival.
Yeah, ultimately that is a very important part
of the whole process of understanding this sort of
emergence of mind from brain kind of thing.
And the whole thing seems to be pretty continuous.
So let me step back to neural networks
for another brief minute.
You wrote parallel distributed processing books
that explored ideas of neural networks
in the 1980s together with a few folks.
But the books you wrote with David Rumelhart,
who is the first author on the back propagation paper
with Geoff Hinton.
So these are just some figures at the time
that were thinking about these big ideas.
What are some memorable moments of discovery
and beautiful ideas from those early days?
I'm gonna start sort of with my own process
in the mid-70s and then into the late 70s
when I met Jeff Hinton and he came to San Diego.
And we were all together. In my time in graduate school,
which I've already described to you,
I had this sort of feeling of,
okay, I'm really interested in human cognition,
but this disembodied sort of way of thinking about it
that I'm getting from the current mode of thought about it
isn't working fully for me.
And when I got my assistant professorship,
I went to UCSD and that was in 1974.
Something amazing had just happened.
Dave Rumelhart had written a book together
with another man named Don Norman.
And the book was called Explorations in Cognition.
And it was a series of chapters
exploring interesting questions about cognition,
but in a completely sort of abstract,
non-biological kind of way.
And I think, gee, this is amazing.
I'm coming to this community where people can get together
and feel like they're collectively exploring ideas.
And it was a book that had a lot of,
I don't know, lightness to it.
And Don Norman, who was the more senior figure
to Rumelhart at that time, who led that project,
always created this spirit of playful exploration of ideas.
And so I'm like, wow, this is great.
But I was also still trying to get from the neurons
to the cognition.
And I realized at one point,
I got this opportunity to go to a conference
where I heard a talk by a man named James Anderson,
who was an engineer, but by then a professor
in a psychology department who had used linear algebra
to create neural network models
of perception and categorization and memory.
And it just blew me out of the water
that one could create a model
that was simulating neurons,
not just kind of engaged in a stepwise algorithmic process
that was construed abstractly.
But it was simulating remembering and recalling
and recognizing the prior occurrence of a stimulus
or something like that.
So for me, this was a bridge between the mind and the brain.
And I remember I was walking cross campus one day in 1977
and I almost felt like St. Paul on the road to Damascus.
I said to myself, you know, if I think about the mind
in terms of a neural network,
it will help me answer the questions about the mind
that I'm trying to answer.
And that really excited me.
So I think that a lot of people
were becoming excited about that.
And one of those people was Jim Anderson,
who I had mentioned.
Another one was Steve Grossberg,
who had been writing about neural networks
since the 60s and Jeff Hinton was yet another.
And his PhD dissertation showed up in an applicant pool
to a postdoctoral training program
that Dave and Don, the two men I mentioned before,
Rumelhart and Norman were administering.
And Rumelhart got really excited
about Hinton's PhD dissertation.
And so Hinton was one of the first people
who came and joined this group of postdoctoral scholars
that was funded by this wonderful grant that they got.
Another one who is also well-known
in neural network circles is Paul Smolensky.
He was another one of that group.
Anyway, Jeff and Jim Anderson organized a conference
at UCSD where we were.
And it was called Parallel Models of Associative Memory
and it brought all the people together
who had been thinking about these kinds of ideas
in 1979 or 1980.
And this began to kind of
really resonate with some of Rumelhart's own thinking,
some of his reasons for wanting something other than
the kinds of computation he'd been doing so far.
So let me talk about Rumelhart now for a minute.
Okay, with that context.
Well, let me also just pause
because you said so many interesting things
before we go to Rumelhart.
So first of all, for people who are not familiar,
neural networks are at the core of the machine learning,
deep learning revolution of today,
Geoffrey Hinton, who we mentioned, is one of the figures
that were important in the history,
like yourself in the development of these neural networks,
artificial neural networks that are then used
for the machine learning application.
Like I mentioned, the back propagation paper
is one of the optimization mechanisms
by which these networks can learn.
And the word parallel is really interesting.
So it's almost like synonymous
from a computational perspective
of how you thought at the time about neural networks,
that it's parallel computation.
Is that, would that be fair to say?
Well, yeah, the word parallel in this
comes from the idea that each neuron
is an independent computational unit, right?
It gathers data from other neurons,
it integrates it in a certain way
and then it produces a result.
And it's a very simple little computational unit,
but it's autonomous in the sense that it does its thing, right?
It's in a biological medium
where it's getting nutrients and various chemicals
from that medium, but you can think of it
as almost like a little computer in and of itself.
So the idea is that each, our brains have,
oh, look, tens of billions,
close to a hundred billion of these little neurons, right?
And they're all capable of doing their work at the same time.
So it's like, instead of just a single central processor
that's engaged in chug one step after another,
we have billions of these little computational units
working at the same time.
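A minimal Python sketch of this idea (illustrative only, not something from the conversation): each unit integrates its weighted inputs and applies a squashing function, and a whole layer of such units can conceptually be updated in parallel. All sizes and numbers here are arbitrary.

```python
# Each "neuron" is a simple, autonomous unit: gather inputs through connection
# weights, integrate them, squash the result. A layer of them is conceptually
# computed in parallel. Sizes and values are arbitrary, for illustration only.
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_units = 8, 4
weights = rng.normal(size=(n_units, n_inputs))   # each row: one unit's incoming connections
inputs = rng.normal(size=n_inputs)

def unit_output(w_row, x):
    """One neuron-like unit: weighted sum of its inputs, then a squashing function."""
    return 1.0 / (1.0 + np.exp(-w_row @ x))

# Conceptually, every unit does this at the same time, independently of the others.
outputs = np.array([unit_output(w, inputs) for w in weights])
print(outputs)
```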
So at the time that's, I don't know, maybe you can comment,
it seems to me, even still to me,
quite a revolutionary way to think about computation
relative to the development of theoretical computer science
alongside of that, where it's very much like sequential
computer, you're analyzing algorithms
that are running on a single computer,
you're saying, wait a minute,
why don't we take a really dumb, very simple computer
and just have a lot of them interconnected together?
And they're all operating in their own little world
and they're communicating with each other
and thinking of computation in that way.
And from that kind of computation,
trying to understand how things like certain characteristics
of the human mind can emerge.
That's quite a revolutionary way of thinking, I would say.
Well, yes, I agree with you.
And there's still this sort of sense
of not sort of knowing how we kind of get all the way there,
I think, and this very much remains
at the core of the questions that everybody's asking
about the capabilities of deep learning
and all these kinds of things.
But if I could just play this out a little bit,
a convolutional neural network or a CNN,
which many people may have heard of,
is a set of, you could think of it biologically
as a set of collections of neurons.
Each collection has maybe 10,000 neurons in it,
but there's many layers, right?
Some of these things are hundreds
or even a thousand layers deep,
but others are closer to the biological brain
and maybe they're like 20 layers deep
or something like that.
So we have, within each layer, we have thousands of neurons
or tens of thousands maybe.
Well, in the brain, we probably have millions
in each layer, but we're getting sort of similar
in a certain way, right?
And then we think, okay, at the bottom level,
there's an array of things that are like the photoreceptors
in the eye, they respond to the amount of light
of a certain wavelength at a certain location
on the pixel array.
So that's like the biological eye
and then there's several further stages going up,
layers of these neuron-like units.
And you go from that raw input, array of pixels,
to a classification, you've actually built a system
that could do the same kind of thing that you and I do
when we open our eyes and we look around
and we see there's a cup, there's a cell phone,
there's a water bottle, and these systems
are doing that now, right?
So they are, in terms of the parallel idea
that we were talking about before,
they are doing this massively parallel computation
in the sense that each of the neurons
in each of those layers is thought of as computing
its little bit of something about the input
simultaneously with all the other ones in the same layer.
We get to the point of abstracting that away
and thinking, oh, it's just one whole vector
that's being computed, one activation pattern
that's computed in a single step
and that abstraction is useful,
but it's still that parallel
and distributed processing, right?
Each one of these guys is just contributing
a tiny bit to that whole thing.
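A rough sketch of that layered, parallel picture (illustrative only; it leaves out the convolutional weight sharing a real CNN uses): an input "pixel array" is transformed layer by layer, each layer being one whole activation pattern computed from the one below, with the last layer standing in for the classification.

```python
# Layer-by-layer transformation from a "pixel array" to a small output layer.
# Every unit in a layer computes its bit of the pattern from the layer below;
# the abstraction is one activation pattern per layer. Sizes are illustrative.
import numpy as np

rng = np.random.default_rng(1)

layer_sizes = [64, 32, 16, 3]          # e.g. pixels -> features -> ... -> {cup, phone, bottle}
weights = [rng.normal(scale=0.1, size=(m, n))
           for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(pixels):
    activation = pixels
    for W in weights:
        # conceptually, every unit in this layer is computed "at once"
        activation = np.tanh(W @ activation)
    return activation.argmax()         # index of the winning output unit

print(forward(rng.normal(size=layer_sizes[0])))
```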
And that's the excitement that you felt
that from these simple things you can emerge
when you add these level of abstractions on it,
you can start getting all the beautiful things
that we think about as cognition.
And so, okay, so you have this conference,
I forgot the name already, but it's parallel
and something associated with memory and so on.
Very exciting, technical and exciting title,
and you started talking about Dave Rumelhart,
so who is this person?
You've spoken very highly of him.
Can you tell me about him, his ideas, his mind,
who he was as a human being, as a scientist?
So, Dave came from a little tiny town
in western South Dakota and his mother was the librarian
and his father was the editor of the newspaper.
And I know one of his brothers pretty well.
They grew up, there were four brothers
and they grew up together and their father
encouraged them to compete with each other a lot.
They competed in sports and they competed in mind games.
I don't know, things like Sudoku and chess
and various things like that.
And Dave was a standout undergraduate.
He went at a younger age than most people do to college
at the University of South Dakota
and majored in mathematics.
And I don't know how he got interested in psychology,
but he applied to the mathematical psychology program
at Stanford and was accepted as a PhD student
to study mathematical psychology at Stanford.
So, mathematical psychology is the use of mathematics
to model mental processes, right?
So, something that I think these days
might be called cognitive modeling, that whole space.
Yeah, it's mathematical in the sense that you say,
if this is true and that is true,
then I can derive that this should follow, okay?
And so you say, these are my stipulations
about the fundamental principles
and this is my prediction about behavior.
And it's all done with equations.
It's not done with a computer simulation, right?
So, you solve the equation and that tells you
what the probability that the subject will be correct
on the seventh trial of the experiment is
or something like that, right?
So, it's a use of mathematics
to descriptively characterize aspects of behavior.
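A classic example of this style of modeling (offered only as an illustration of the genre, not as something tied to Rumelhart's own work) is the all-or-none learning model: an item stays unlearned until it jumps to the learned state with probability c on each trial, and unlearned items are guessed correctly with probability g.

```python
# All-or-none learning model: solve for the predicted probability of a correct
# response on a given trial. Parameter values are illustrative, not fitted.
c, g = 0.25, 0.5

def p_correct(n):
    p_still_unlearned = (1 - c) ** (n - 1)
    return 1 - p_still_unlearned * (1 - g)

print(p_correct(7))    # predicted probability correct on the seventh trial
```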
And Stanford at that time was the place
where there were several really, really strong
mathematical thinkers who were also connected
with three or four others around the country
who brought a lot of really exciting ideas onto the table.
And it was a very, very prestigious part
of the field of psychology at that time.
So, Rumelhart comes into this.
He was a very strong student within that program.
And he got this job at this brand new university
in San Diego in 1967, where he's one of the first
assistant professors in the Department of Psychology
at UCSD.
So, I got there in 74, seven years later.
And Rumelhart at that time was still doing
mathematical modeling, but he had gotten interested
in cognition.
He'd gotten interested in understanding.
And understanding, I think, remains,
what does it mean to understand anyway?
It's an interesting sort of curious,
like how would we know if we really understood something?
But he was interested in building machines
that would hear a couple of sentences
and have an insight about what was going on.
So, for example, one of his favorite things
at that time was,
Margie was sitting on the front step
when she heard the familiar jingle of the Good Humor man.
She remembered her birthday money and ran into the house.
What is Margie doing?
Why?
Well, there's a couple of ideas you could have,
but the most natural one is that
the Good Humor man brings ice cream.
She likes ice cream.
She knows she needs money to buy ice cream,
so she's gonna run into the house and get her money
so she can buy herself an ice cream.
It's a huge amount of inference that has to happen
to get those things to link up with each other.
And he was interested in how the hell that could happen.
And he was trying to build,
you know, good old-fashioned AI style models
of representation of language and content of,
you know, things like has money.
So like a lot of like formal logic and like knowledge basis,
like that kind of stuff.
So he was integrating that
with his thinking about cognition.
Yes.
The mechanisms of cognition,
how can they like mechanistically be applied
to build these knowledge,
like to actually build something
that looks like a web of knowledge
and thereby from there emerges something like understanding,
whatever the heck that is.
Yeah, but he was grappling,
this was something that they grappled with
at the end of that book
that I was describing, Explorations in Cognition.
But he was realizing that the paradigm
of good old-fashioned AI wasn't giving him the answers
to these questions.
By the way, that's called good old-fashioned AI now.
It was called that at the time.
Well, it was.
It was beginning to be called that.
Oh, because it was from the 60s.
Yeah, yeah.
By the late 70s, it was kind of old-fashioned
and it hadn't really panned out, you know?
And people were beginning to recognize that.
But, and Rumelhart was, you know, like, yeah,
he was part of the recognition that this wasn't all working.
Anyway, so he started thinking in terms of the idea
that we needed systems that allowed us
to integrate multiple simultaneous constraints
in a way that would be mutually influencing each other.
So he wrote a paper that just really, first time I read it,
I said, oh, well, you know, yeah, but is this important?
But after a while, it just got under my skin.
And it was called an interactive model of reading.
And in this paper, he laid out the idea
that every aspect of our interpretation of what's coming off the page when we read,
at every level of analysis you can think of,
actually depends on all the other levels of analysis.
So what are the actual pixels making up each letter?
And what do those pixels signify about which letters they are?
And what do those letters tell us about what words are there?
And what do those words tell us about what ideas
the author is trying to convey?
And so he had this model where we have these little tiny elements
that represent each of the pixels of each of the letters
and then other ones that represent the line segments in them
and other ones that represent the letters
and other ones that represent the words.
And at that time, his idea was there's this set of experts.
There's an expert about how to construct a line out of pixels
and another expert about how which sets of lines
go together to make which letters
and another one about which letters go together to make which words
and another one about what the meanings of the words are
and another one about how the meanings fit together
and you know, things like that.
And all these experts are looking at this data
and they're updating hypotheses at other levels.
So the word expert can tell the letter expert,
oh, I think there should be a T there
because I think there should be a word the here
and the bottom up sort of feature to letter expert
could say, I think there should be a T there too.
And if they agree, then you see a T, right?
And so there's a top down bottom up interactive process
but it's going on at all layers simultaneously.
So everything can filter all the way down from the top
as well as all the way up from the bottom.
And it's a completely interactive,
bi-directional, parallel distributed process.
That is somehow because of the abstractions is hierarchical.
So like, so there's different layers of responsibilities,
different levels of responsibilities.
First of all, it's fascinating to think about it
in this kind of mechanistic way.
So not thinking purely from the structure
of your own network or something like in your own network
but thinking about these little guys that work on letters
and then the letters come words and words become sentences.
And that's a very interesting hypothesis
that from that kind of hierarchical structure
can emerge understanding.
Yeah, so, but the thing is though,
I wanna just sort of relate this
to earlier part of the conversation.
When Rumelhart was first thinking about it,
there were these experts on the side,
one for the features and one for the letters
and one for how the letters make the words and so on.
And they would each be working
sort of evaluating various propositions about,
is this combination of features here going to be one
that looks like the letter T and so on.
And what he realized kind of after reading
Hinton's dissertation and hearing about Jim Anderson's
linear algebra based neural network models
that I was telling you about before
was that he could replace those experts
with neuron like processing units
which just would have their connection weights
that would do this job.
So what ended up happening was that Rumelhart
and I got together and we created a model
called the Interactive Activation Model of Letter Perception
which takes these little pixel level inputs,
constructs line segment features, letters and words.
But now we built it out of a set of neuron like processing
units that are just connected to each other
with connection weights.
So the unit for the word time has a connection
to the unit for the letter T in the first position
and the letter I in the second position, so on.
And because these connections are bidirectional,
if you have prior knowledge that it might be the word time
that starts to prime the letters and the features
and if you don't, then it has to start bottom up
but the directionality just depends
on where the information comes in first.
And if you have context together with features
at the same time, they can convergently result
in an emergent perception.
And that was the piece of work that we did together
that sort of got us both completely convinced
that this neural network way of thinking
was going to be able to actually address the questions
that we were interested in as cognitive scientists.
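A toy sketch in the spirit of that interactive activation account (not the published model; the two-word lexicon, connection scheme, and constants are all made up): letter-in-position units and word units excite each other in both directions, words compete with each other, and the network settles on an interpretation.

```python
# Letter-in-position units and word units with bidirectional excitatory
# connections plus competition among words; activation flows bottom-up and
# top-down until the network settles. All numbers are invented.
import numpy as np

words = ["time", "tame"]
# letter-in-position units: T in position 1, I in 2, A in 2, M in 3, E in 4
W = np.array([[1., 1., 0., 1., 1.],    # "time" is consistent with T1, I2, M3, E4
              [1., 0., 1., 1., 1.]])   # "tame" is consistent with T1, A2, M3, E4

letters = np.array([0.9, 0.6, 0.1, 0.8, 0.8])   # noisy bottom-up evidence favoring "time"
word_act = np.zeros(2)

for _ in range(20):
    # bottom-up support minus within-layer competition, then a little decay
    net = W @ letters - 3.0 * (word_act.sum() - word_act)
    word_act = np.clip(word_act + 0.1 * net - 0.05, 0.0, 1.0)
    # top-down feedback: active words re-excite the letters they contain
    letters = np.clip(letters + 0.1 * (W.T @ word_act) - 0.05, 0.0, 1.0)

print(dict(zip(words, word_act.round(2))))   # "time" should dominate and prime its letters
```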
So the algorithmic side, the optimization side,
those are all details like when you first start
the idea that you can get far
with this kind of way of thinking,
that in itself is a profound idea.
So do you like the term connectionism
to describe this kind of set of ideas?
I think it's useful.
It highlights the notion that the knowledge
that the system exploits is in the connections
between the units, right?
There isn't a separate dictionary,
there's just the connections between the units.
So I already sort of laid that on the table
with the connections from the letter units
to the unit for the word time, right?
The unit for the word time isn't a unit for the word time
for any other reason than it's got the connections
to the letters that make up the word time.
Those are the units on the input that excited
when it's excited that it in a sense represents
in the system that there's support for the hypothesis
that the word time is present in the input.
But it's not, the word time isn't written anywhere
inside the model, it's only written there
in the picture we drew of the model to say
that's the unit for the word time, right?
And if somebody wants to tell me,
well, how do you spell that word?
You have to use the connections from that unit out
to then get those letters, for example.
That's such a counterintuitive idea
where humans want to think in this logic way.
This idea of connectionism, it doesn't, it's weird.
It's weird that this is how it all works.
Yeah, but let's go back to that CNN, right?
That CNN with all those layers of neuron-like processing
units that we were talking about before,
it's gonna come out and say, this is a cat, that's a dog.
But it has no idea why it said that.
It's just got all these connections
between all these layers of neurons,
from the very first layer to the,
whatever these layers are,
they just get numbered after a while
because they somehow further in you go,
the more abstract the features are,
but it's a graded and continuous sort of process
of abstraction anyway.
And it goes from very local, very specific
to much more sort of global,
but it's still another sort of pattern of activation
over an array of units.
And then at the output side, it says it's cat or it's a dog.
And when I open my eyes and say, oh, that's Lex,
or oh, there's my own dog and I recognize my dog,
which is a member of the same species as many other dogs,
but I know this one because of some slightly unique
characteristics, I don't know how to describe
what it is that makes me know that I'm looking at Lex
or at my particular dog, right?
Or even that I'm looking at a particular brand of car.
Like I can say a few words about it,
but if I wrote you a paragraph about the car,
you would have trouble figuring out
which car I was talking about, right?
So the idea that we have propositional knowledge
of what it is that allows us to recognize
that this is an actual instance
of this particular natural kind
has always been something that never worked, right?
You couldn't ever write down
a set of propositions for visual recognition.
And so in that space, it's sort of always seemed very natural
that something more implicit,
you don't have access to what the details
of the computation were in between,
you just get the result.
So that's the other part of connectionism.
You cannot, you don't read the contents of the connections.
The connections only cause outputs
to occur based on inputs.
Yeah, and for us that like final layer
or some particular layer is very important.
The one that tells us that it's our dog
or like it's a cat or a dog,
but each layer is probably equally as important
in the grand scheme of things.
Like there's no reason why the cat versus dog
is more important than the lower level activations.
It doesn't really matter.
I mean, all of it is just this beautiful stacking
on top of each other.
And we humans live in this particular layers for us.
For us, it's useful to survive,
to use those cat versus dog, predator versus prey,
all those kinds of things.
It's fascinating that it's all continuous.
But then you then ask,
the history of artificial intelligence, you ask,
are we able to introspect and convert the very things
that allow us to tell the difference to cat and dog
into logic, into form of logic.
That's been the dream.
I would say that's still part of the dream of symbolic AI
and I've recently talked to Doug Lenat
who created Cyc.
And that's a project that lasted for many decades
and still carries a sort of dream in it, right?
We still don't know the answer, right?
It seems like connectionism is really powerful,
but it also seems like there's this building of knowledge.
And so how do we, how do you square those two?
Like, do you think the connections can contain
the depth of human knowledge
and the depth of what Dave Rumelhart
was thinking about of understanding?
Well, that remains the $64 question and I...
With inflation that number is higher.
Okay, the $64,000 question.
Maybe it's a $64 billion question now.
I think that from the emergentist side,
which I place myself on,
so I used to sometimes tell people
I was a radical, eliminative connectionist
because I didn't want them to think
that I wanted to build like anything into the machine,
but I don't like the word eliminative anymore
because it makes it seem like
it's wrong to think that there is this emergent level
of understanding and I disagree with that.
So I think, I would call myself
a radical emergentist connectionist
rather than an eliminative connectionist, right?
Because I want to acknowledge
that these higher level kinds of aspects
of our cognition are real, but they don't exist as such.
And there was an example that Doug Hofstadter used to use
that I thought was helpful in this respect,
just the idea that we could think about sand dunes as entities
and talk about like how many there are even,
but we also know that a sand dune is a very fluid thing.
It's a pile of sand that is capable
of moving around under the wind
and reforming itself in somewhat different ways.
And if we think about our thoughts as like sand dunes
as being things that emerge from
just the way all the lower level elements
sort of work together and are constrained by external forces,
then we can say, yes, they exist as such,
but they also, we shouldn't treat them
as completely monolithic entities
that we can understand without understanding
sort of all of the stuff that allows them
to change in the ways that they do.
And that's where I think the connectionist feeds
into the cognitive, it's like, okay,
so if the substrate is parallel distributed,
connectionist, then it doesn't mean
that the contents of thought isn't like abstract
and symbolic, but it's more fluid maybe
than it's easier to capture with a set of logical expressions.
Yeah, that's a heck of a sort of thing
to put at the top of a resume,
radical, emergentist, connectionist.
So there is, just like you said, a beautiful dance between that,
between the machinery of intelligence,
like the neural network side of it,
and the stuff that emerges.
I mean, the stuff that emerges seems to be,
I don't know, I don't know what that is,
that it seems like maybe all of reality is emergent.
What I think about, this is made most distinctly rich
to me when I look at cellular automata,
look at the Game of Life, that from very, very simple things,
very rich, complex things emerge
that start looking very quickly like organisms,
that you forget how the actual thing operates.
They start looking like they're moving around,
they're eating each other,
some of them are generating offspring,
you forget very quickly.
And it seems like maybe it's something about the human mind
that wants to operate in some layer of the emergent
and forget about the mechanism of how that emergent happens.
So, just like you said, in your radicalness,
it also seems unfair to eliminate the magic
of that emergent, like eliminate the fact
that that emergent is real.
Yeah, no, I agree.
I'm not, that's why I got rid of eliminative, right?
Eliminative, yeah.
Yeah, because it seemed like that was trying to say
that it's all completely like.
An illusion of some kind, it's not.
Well, who knows whether there isn't,
there aren't some illusory characteristics there.
And I think that philosophically,
many people have confronted that possibility over time,
but it's still important to accept it as magic, right?
So, I think of Fellini in this context,
I think of others who have appreciated
the role of magic, of actual trickery in creating illusions
that move us, you know, and Plato was on to this too.
It's like somehow or other, these shadows,
give rise to something much deeper than that.
And that's, so we won't try to figure out what it is,
we'll just accept it as given that that occurs.
And, but he was still onto the magic of it.
Yeah, yeah, we won't try to really, really,
really deeply understand how it works.
We'll just enjoy the fact that it's kind of fun.
Okay, but you worked closely with Dave Rumelhart.
He passed away.
As a human being, what do you remember about him?
Do you miss the guy?
Absolutely, you know, he passed away 15-ish years ago now, and his demise was actually one of the most poignant and, you know, like relevant tragedies.
It's relevant to our conversation.
He started to undergo a progressive neurological condition that isn't fully understood.
That is to say, his particular course isn't fully understood because certain, you know,
brain scans weren't done at certain stages and no autopsy was done or anything like that.
The wishes of the family.
So we don't know as much about the underlying pathology as we might.
But I had begun to get interested in this neurological condition that might have been the very one that he was succumbing to as my own efforts
to understand another aspect of this mystery that we've been discussing while he was beginning to get progressively more and more affected.
So I'm going to talk about the disorder and not about Rumelhart for a second.
Okay, the disorder is something my colleagues and collaborators have chosen to call semantic dementia.
So it's a specific form of loss of mind related to meaning, semantic dementia.
And it's progressive in the sense that the patient loses the ability to appreciate the meaning of the experiences that they have,
either from touch, from sight, from sound, from language, they, I hear sounds, but I don't know what they mean kind of thing.
So as this illness progresses, it starts with the patient being unable to differentiate like similar breeds of dog
or remember the lower frequency unfamiliar categories that they used to be able to remember.
But as it progresses, it becomes more and more striking and the patient loses the ability to recognize things like pigs and goats and sheep
and calls all middle-sized animals dogs, and can't recognize rabbits and rodents anymore.
They call all the little ones cats and they can't recognize hippopotamuses and cows anymore.
They call them all horses, you know.
So there was this one patient who went through this progression where at a certain point, any four-legged animal,
he would call it either a horse or a dog or a cat.
And if it was big, he would tend to call it a horse.
If it was small, he'd tend to call it a cat.
Middle-sized ones, he called dogs.
This is just a part of the syndrome, though.
The patient loses the ability to relate concepts to each other.
So my collaborator in this work, Karalyn Patterson, developed a test called the Pyramids and Palm Trees Test.
So you give the patient a picture of pyramids and they have a choice, which goes with the pyramids?
Palm trees or pine trees?
And, you know, she showed that this wasn't just a matter of language because the patient's
loss of ability shows up whether you present the material with words or with pictures.
The pictures, they can't put the pictures together with each other properly anymore.
They can't relate the pictures to the words either.
They can't do word picture matching, but they've lost the conceptual grounding from either modality of input.
And so that's why it's called semantic dementia.
The very semantics is disintegrating.
And we understand this in terms of our idea of distributed representation: a pattern
of activation represents a concept, and really similar concepts have really similar patterns.
As you degrade those patterns, you start to lose the differences.
And then, so the difference between the dog and the goat sort of is no longer part of
the pattern anymore.
And since dog is really familiar, that's the thing that remains.
And we understand that in the way the models work and learn.
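A toy illustration of that idea (entirely invented features and frequencies, not a model of the actual patient data): if a degraded pattern regresses toward the average of familiar patterns, the fine distinctions go first and the most familiar category is what remains.

```python
# Invented feature vectors for three middle-sized animals plus invented
# familiarity frequencies; "degradation" regresses a pattern toward the
# frequency-weighted average of familiar patterns.
import numpy as np

#                              animal legs barks bleats oinks
prototypes = {"dog":  np.array([1.0, 1.0, 1.0, 0.0, 0.0]),
              "goat": np.array([1.0, 1.0, 0.0, 1.0, 0.0]),
              "pig":  np.array([1.0, 1.0, 0.0, 0.0, 1.0])}
frequency = {"dog": 0.7, "goat": 0.2, "pig": 0.1}          # dogs are the most familiar
familiar_mean = sum(f * prototypes[n] for n, f in frequency.items())

def name_it(pattern):
    """Name a pattern by its nearest stored prototype."""
    return min(prototypes, key=lambda n: np.linalg.norm(pattern - prototypes[n]))

for damage in (0.0, 0.5, 0.9):
    degraded = (1 - damage) * prototypes["goat"] + damage * familiar_mean
    print(f"damage {damage}: the goat is called a '{name_it(degraded)}'")
```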
But Rumelhart underwent this condition.
So on the one hand, it's a fascinating aspect of parallel distributed processing to be.
And it reveals this sort of texture of distributed representation in a very nice way, I've always
felt.
But at the same time, it was extremely poignant because this is exactly the condition that
Rumelhart was undergoing.
And there was a period of time when he was, this man who had been the most focused, goal-directed,
competitive, thoughtful person who was willing to work for years to solve a hard problem,
you know, he starts to disappear.
And there was a period of time when it was like hard for any of us to really appreciate
that he was sort of, in some sense, not fully there anymore.
Do you know if he was able to introspect on this, the dissolution of this, you know, the understanding
mind?
Was he, I mean, this is one of the big scientists that thinks about this.
Was he able to look at himself and understand the fading mind?
You know, we can contrast Hawking and Rommelhardt in this way.
And I like to do that to honor Rumelhart because I think Rumelhart is sort of like
the Hawking of, you know, cognitive science to me in some ways.
Both of them suffered from a degenerative condition.
In Hawking's case, it affected the motor system.
And in Rumelhart's case, it's affecting the semantics and not just the pure object semantics,
but maybe the self semantics as well.
And we don't understand that.
Concepts broadly.
So, I would say he didn't, and this was part of what from the outside was a profound tragedy.
But on the other hand, at some level, he sort of did because, you know, there was a period
of time when he finally realized that he had really become profoundly impaired.
This was clearly a biological condition and he wasn't, you know, it wasn't just like he
was distracted that day or something like that.
So he retired, you know, from his professorship at Stanford and he became, he lived with his
brother for a couple of years and then he moved into a facility for people with cognitive
impairments, one that, you know, many elderly people end up in when they have cognitive
impairments.
And I would spend time with him during that period.
This was like in the late 90s, around 2000 even.
And you know, we would go bowling and he could still bowl.
And after bowling, I took him to lunch and I said, where would you like to go?
You want to go to Wendy's and he said, nah, and I said, okay, well, where do you want
to go?
And he just pointed.
He said, turn here, you know.
So he still had a certain amount of spatial cognition and he could get me to the restaurant.
And then when we got to the restaurant, I said, what do you want to order?
And he couldn't come up with any of the words, but he knew where on the menu the thing was
that he wanted.
So it's, you know, and he couldn't say what it was, but he knew that that's what he wanted
to eat.
And so, you know, it's like it isn't monolithic at all.
Our cognition is, you know, first of all, graded in certain kinds of ways, but also multi-partite.
There's many elements to it and things, certain sort of partial competencies still exist in
the absence of other aspects of these competencies.
So this is what always fascinated me about what used to be called cognitive neuropsychology,
you know, the effects of brain damage on cognition, but in particular this gradual disintegration
part.
I'm a big believer that the loss of a human being that you value is as powerful as, you
know, first falling in love with that human being.
I think it's all a celebration of the human being.
So the disintegration itself too is a celebration in a way.
Yeah.
Yeah.
And, but just to say something more about the scientist and the back propagation idea
that you mentioned.
So in 1982, Hinton had been there as a postdoc and organized that conference.
He'd actually gone away and gotten an assistant professorship, and then there was this opportunity
to bring him back.
So Jeff Hinton was back on a sabbatical.
San Diego.
In San Diego.
And Rumelhart and I had decided we wanted to do this, you know, we thought it was really
exciting, and the papers on the interactive activation model that I was telling you about
had just been published.
And we both sort of saw huge potential for this work and Jeff was there.
And so the three of us started a research group, which we called the PDP Research Group.
And several other people came, Francis Crick, who was at the Salk Institute, heard about
it from Jeff.
And because Jeff was known among Brits to be brilliant and Francis was well connected
with his British friends.
So Francis Crick came and a heck of a group of people.
Wow.
Okay.
And several others. Paul Smolensky was one of the other postdocs, he was still there as
a postdoc and a few other people.
But anyway, Jeff talked to us about learning and how we should think about how, you know,
learning occurs in a neural network.
And he said, the problem with the way you guys have been approaching this is that you've
been looking for inspiration from biology to tell you how, what the rule should be for
how the synapses should change the strengths of their connections, how the connections
should form.
He said, that's the wrong way to go about it.
What you should do is you should think in terms of how you can adjust connection weights
to solve a problem.
So you define your problem and then you figure out how the adjustment of the connection weights
will solve the problem.
And Rumelhart heard that and said to himself, okay, so I'm going to start thinking about
it that way.
I'm going to essentially imagine that I have some objective function, some goal of the
computation.
I want my machine to correctly classify all of these images and I can score that.
I can measure how well they're doing on each image and I get some measure of error or loss
it's typically called in deep learning.
And I'm going to figure out how to adjust the connection weights so as to minimize my
loss or reduce the error.
And that's called, you know, gradient descent.
And engineers were already familiar with the concept of gradient descent.
And in fact, there was an algorithm called the Delta Rule that had been invented by a
professor in the electrical engineering department at Stanford, Bernie Widrow and a collaborator
named Hoff.
I never met him.
Anyway, so gradient descent in continuous neural networks with multiple neuron-like
processing units was already understood for a single layer of connection weights.
We have some inputs over a set of neurons.
We want the output to produce a certain pattern.
We can define the difference between our target and what the neural network is producing
and we can figure out how to change the connection weights to reduce that error.
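A minimal sketch of that single-layer case (the delta rule in its least-mean-squares form; sizes, data, and learning rate here are arbitrary): each weight is nudged in proportion to the output error times the input that feeds it.

```python
# Single-layer delta rule: nudge each connection weight in the direction that
# reduces the squared difference between the target and the obtained output.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))            # 100 input patterns over 5 input units
true_W = rng.normal(size=(3, 5))         # the mapping we want the weights to learn
T = X @ true_W.T                         # target patterns over 3 output units

W = np.zeros((3, 5))
lr = 0.05
for _ in range(200):
    for x, t in zip(X, T):
        y = W @ x                        # linear output units
        error = t - y                    # "delta": target minus obtained output
        W += lr * np.outer(error, x)     # one small gradient-descent step on the error

print(np.abs(W - true_W).max())          # should be near zero
```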
So what Rumelhart did was to generalize that so as to be able to change the connections
from earlier layers of units to the ones at a hidden layer between the input and the
output.
And so he first called the algorithm the generalized Delta Rule because it's just an extension
of the gradient descent idea.
And interestingly enough, Hinton was thinking that this wasn't going to work very well.
So Hinton had his own alternative algorithm at the time based on the concept of the Boltzmann
Machine that he was pursuing.
So the paper on learning in Boltzmann Machines came out in 1985, but it turned out that back
prop worked better than the Boltzmann Machine learning algorithm.
So this generalized Delta algorithm ended up being called back propagation, as you say,
back prop.
Yeah.
And probably that name is opaque to me, but what does that mean?
What it meant was that in order to figure out what the changes you needed to make to
the connections from the input to the hidden layer, you had to back propagate the error
signals from the output layer through the connections from the hidden layer to the output
to get the signals that would be the error signals for the hidden layer.
And that's how Rumelhart formulated it.
It was like, well, we know what the error signals are at the output layer.
Let's see if we can get a signal at the hidden layer that tells each hidden unit what its
error signal is essentially.
So it's back propagating through the connections from the hidden to the output to get the signals
to tell the hidden units how to change their weights from the input, and that's why it's
called back prop.
But so it came from Hinton having introduced the concept of define your objective function,
figure out how to take the derivative so that you can adjust the connections so that they
make progress towards your goal.
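A hedged sketch of that generalized delta rule on a toy problem (XOR; the architecture, constants, and iteration count are arbitrary, not tuned): the output error is pushed backward through the hidden-to-output weights to give each hidden unit its own error signal, which is then used to change the input-to-hidden weights.

```python
# A toy network learning XOR with one hidden layer. The output error is
# back-propagated through W2 to give each hidden unit an error signal.
import numpy as np

rng = np.random.default_rng(3)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)   # input -> hidden
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)   # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(10000):
    h = sigmoid(X @ W1 + b1)                       # hidden activations
    y = sigmoid(h @ W2 + b2)                       # output activations
    delta_out = (y - T) * y * (1 - y)              # error signal at the output layer
    delta_hid = (delta_out @ W2.T) * h * (1 - h)   # error back-propagated to the hidden units
    W2 -= lr * h.T @ delta_out; b2 -= lr * delta_out.sum(axis=0)
    W1 -= lr * X.T @ delta_hid; b1 -= lr * delta_hid.sum(axis=0)

print(y.round(2).ravel())                          # should approach [0, 1, 1, 0]
```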
So stop thinking about biology for a second, and let's start to think about optimization
and computation a little bit more.
So what about Jeff Hinton?
But you've gotten a chance to work with him in that little...
The set of people involved there is quite incredible.
The small set of people under the PDP flag, it's just given the amount of impact those
ideas have had over the years, it's kind of incredible to think about.
But, you know, just like you said, like yourself, Jeff Hinton is seen as one of the, not just
like a seminal figure in AI, but just a brilliant person, just like the horsepower of the mind
is pretty high up there for him, because he's just a great thinker.
So what kind of ideas have you learned from him?
Have you influenced each other on?
Have you debated over what stands out to you in the full space of ideas here at the intersection
of computation and cognition?
Well, so Jeff has said many things to me that had a profound impact on my thinking.
And he's written several articles which were way ahead of their time.
He had two papers in 1981, just to give one example.
One of which was essentially the idea of Transformers, and another of which was a early paper on
semantic cognition, which inspired him and Rumelhart and me throughout the 80s.
And, you know, still, I think sort of grounds my own thinking about the semantic aspects
of cognition.
He also, in a small paper that was never published that he wrote in 1977, you know, before he
actually arrived at UCSD, or maybe a couple of years even before that, I don't know, when
he was a PhD student, he described how a neural network could do recursive computation.
And it was a very clever idea that he's continued to explore over time, which was sort of the
idea that when you call a subroutine, you need to save the state that you had when you
called it so you can get back to where you were when you're finished with the subroutine.
And the idea was that you would save the state of the calling routine by making fast changes
to connection weights.
And then when you finished with the subroutine call, those fast changes in the connection
weights would allow you to go back to where you had been before and reinstate the previous
context so that you could continue on with the top level of the computation.
Anyway, that was part of the idea.
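A very loose sketch of how that might look (one possible reading of the description above, not Hinton's actual proposal): the caller's activation pattern is bound to a cue by a rapid, temporary Hebbian weight change, and probing those fast weights with the cue after the subroutine returns reinstates the caller's state.

```python
# The calling routine's state is stored as a rapid, temporary ("fast") Hebbian
# weight change binding it to a cue for this particular call; after the
# subroutine runs, probing the fast weights with the cue restores that state.
import numpy as np

rng = np.random.default_rng(4)
d = 16
caller_state = rng.normal(size=d)           # activation pattern of the top-level routine
call_cue = rng.normal(size=d)               # pattern standing for "this particular call"

fast_W = np.outer(caller_state, call_cue)   # fast weight change made at call time

subroutine_state = rng.normal(size=d)       # network activity is taken over by the subroutine

restored = fast_W @ call_cue                # on return, the cue reinstates the caller's state
print(np.corrcoef(restored, caller_state)[0, 1])   # ~1.0: the old context is recovered
```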
And I always thought, okay, that's really, you know, he just, he had extremely creative
ideas that were quite a lot ahead of his time and many of them in the 1970s and early 1980s.
So another thing about Jeff Hinton's way of thinking, which has profoundly influenced
my effort to understand human mathematical cognition, is that he doesn't write too many
equations.
And people tell stories like, oh, in the Hinton lab meetings, you don't get up at the board
and write equations like you do in everybody else's machine learning lab.
What you do is you draw a picture.
And he explains aspects of the way deep learning works by putting his hands together and showing
you the shape of a ravine and using that as a geometrical metaphor for what's happening
as this gradient descent process, you're coming down the wall of a ravine.
If you take too big a jump, you're going to jump to the other side.
And so that's why we have to turn down the learning rate, for example.
And it speaks to me of the fundamentally intuitive character of deep insight together with a commitment
to really understanding in a way that's absolutely ultimately explicit and clear.
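A tiny numeric illustration of the ravine point (constants chosen only for illustration): on a steep quadratic, a gradient step with too large a learning rate jumps across the valley and diverges, while a smaller rate settles.

```python
# Gradient descent on a steep quadratic f(x) = a * x**2: a small learning rate
# settles toward the bottom, a larger one overshoots the valley and diverges.
a = 10.0
for lr in (0.05, 0.15):
    x = 1.0
    for _ in range(10):
        x -= lr * 2 * a * x      # gradient of a * x**2 is 2 * a * x
    print(f"learning rate {lr}: x after 10 steps = {x}")
```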
But also intuitive.
Yeah.
There's certain people like that.
He's an example, some kind of weird mix of visual and intuitive and all those kinds
of things.
Feynman is another example, different style of thinking, but very unique.
And when you're around those people, for me in the engineering realm, there's a guy named
Jim Keller, who's a chip designer, engineer.
Every time I talk to him, it doesn't matter what we're talking about.
Just having experienced that unique way of thinking transforms you and makes your work
much better.
And that's the magic. You look at Daniel Kahneman, you look at the great collaborations throughout the history of science.
That's the magic of that.
It's not always the exact ideas that you talk about, but it's the process of generating
those ideas, being around that, spending time with that human being.
You can come up with some brilliant work, especially when it's cross-disciplinary as
it was a little bit in your case with Jeff.
Yeah.
Jeff is a descendant of the logician Boole.
He comes from a long line of English academics.
And together with the deeply intuitive thinking ability that he has, it's been clear, he's described this to me, and I think he's mentioned it from time to time in other interviews that he's had with people, that he's wanted to be able to sort of think of himself as contributing to the understanding of reasoning itself, not just human reasoning. Like, Boole is about logic, right?
It's about what can we conclude from what else and how do we formalize that?
But as a computer scientist, logician, philosopher, the goal is to understand how we derive truths from givens and things like this.
And the work that Jeff was doing in the early to mid-'80s on something called the Boltzmann Machine was his way of connecting with that Boolean tradition and bringing it into the more continuous, probabilistic, graded constraint satisfaction realm.
And it was beautiful, a set of ideas linked with theoretical physics as well as with logic.
And it's always been, I mean, I've always been inspired by the Boltzmann Machine, too.
It's like, well, if the neurons are probabilistic rather than deterministic in their computations,
then maybe this somehow is part of the serendipity or adventitiousness of the moment of insight.
It might not have occurred at that particular instant.
It might be sort of partially the result of a stochastic process.
And that, too, is part of the magic of the emergence of some of these things.
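As a cartoon of that probabilistic flavor, an illustrative sketch rather than a faithful Boltzmann machine (no learning rule, no energy bookkeeping): binary units that switch on stochastically, so the state the network settles into can differ from run to run even from the same starting point.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A tiny network of stochastic binary units in the Boltzmann machine style:
# each unit turns on with a probability given by its input, not deterministically.
n = 8
W = rng.standard_normal((n, n)) * 0.5
W = (W + W.T) / 2              # symmetric connection weights
np.fill_diagonal(W, 0.0)       # no self-connections
start = rng.integers(0, 2, n)  # a random initial binary state

def gibbs_sweep(s, temperature=1.0):
    """Update every unit once, each stochastically, and return the new state."""
    s = s.copy()
    for i in rng.permutation(n):
        p_on = sigmoid(W[i] @ s / temperature)
        s[i] = 1 if rng.random() < p_on else 0
    return s

# Running from the same starting state twice can settle differently,
# because the units are probabilistic rather than deterministic.
print(gibbs_sweep(start))
print(gibbs_sweep(start))
```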
Well, you're right with the Boolean lineage and the dream of computer science is somehow,
I mean, I certainly think of humans this way, that humans are one particular manifestation
of intelligence, that there's something bigger going on, and you're hoping to figure that
out.
The mechanisms of intelligence, the mechanisms of cognition are much bigger than just humans.
So I think of, I started using the phrase computational intelligence at some point to characterize the field that I thought people like Jeff Hinton and many of the people I
know at DeepMind are working in and where I feel like I'm kind of a human-oriented computational
intelligence researcher in that I'm actually kind of interested in the human solution.
But at the same time, I feel like that's where a huge amount of the excitement of deep learning
actually lies is in the idea that we may be able to even go beyond what we can achieve
with our own nervous systems when we build computational intelligences that are not limited in the ways that we are by our own biology.
Perhaps allowing us to scale the very mechanisms of human intelligence just increases power
through scale.
Yes.
And I think that, obviously, that's being played out massively at Google Brain, at OpenAI, and to some extent at DeepMind as well, I guess I shouldn't say to some extent, the
massive scale of the computations that are used to succeed at games like Go or to solve
the protein folding problems that they've been solving and so on.
Still not as many synapses and neurons as the human brain, so we're still beating them
on that.
We humans are beating the AIs, but they're catching up pretty quickly.
You write about modeling of mathematical cognition, so let me first ask about mathematics in general.
There's a paper titled Parallel Distributed Processing Approach to Mathematical Cognition
where in the introduction there's some beautiful discussion of mathematics, and you reference
there Tristan Needham, who criticizes a narrow view of mathematics by likening the study of mathematics as symbol manipulation to studying music without ever hearing a note.
So from that perspective, what do you think is mathematics?
What is this world of mathematics like?
Well, I think of mathematics as a set of tools for exploring idealized worlds that often
turn out to be extremely relevant to the real world, but need not be; they're worlds in
which objects exist with idealized properties and in which the relationships among them
can be characterized with precision so as to allow the implications of certain facts
to then allow you to derive other facts with certainty.
So if you have two triangles and you know that there is an angle in the first one that
has the same measure as an angle in the second one, and you know that the lengths of the
sides adjacent to that angle in each of the two triangles, the corresponding sides adjacent to that angle, also have the same measure, then you can conclude that the triangles
are congruent, that is to say they have all of their properties in common.
And that is something about triangles, it's not a matter of formulas.
These are idealized objects.
In fact, you know, we built bridges out of triangles and we understand how to measure
the height of something we can't climb by extending these ideas about triangles a little
further, and you know, all of the ability to get a tiny speck of matter launched from
the planet Earth to intersect with some tiny, tiny little body way out in way beyond Pluto
somewhere, at exactly a predicted time and date is something that depends on these ideas,
right?
So it's actually happening in the real physical world that these ideas make contact with it, in those kinds of instances.
But, you know, there are these idealized objects, these triangles or these distances or these points, whatever they are, that allow for this set of tools to be created, and that then gives human beings this incredible leverage that they didn't have without these concepts.
And I think this is actually already true when we think about just, you know, the natural
numbers.
I always like to include zero, so I'm going to say the non-negative integers, but that's
a place where some people prefer not to include zero.
But we like zero here, natural numbers, zero, one, two, three, four, five, six, seven and
so on.
Yeah.
And, and you know, because they give you the ability to be exact about like how many sheep
you have, like, you know, I sent you out this morning, there were 23 sheep, you came back
with only 22.
What happened?
Yeah.
It's a fundamental problem of physics, how many sheep you have.
Yeah.
It's a fundamental problem of life, of human society that you damn well better bring back
the same number of sheep as you started with.
And you know, it allows commerce, it allows contracts, it allows the establishment of
records and so on to have systems that allow these things to be notated.
But they have an inherent aboutness to them that's at one and the same time sort of abstract and idealized and generalizable while, on the other hand, potentially very, very grounded and concrete.
And one of the things that makes for the incredible achievements of the human mind is the fact
that humans invented these idealized systems that leverage the power of human thought in
such a way as to allow all this kind of thing to happen.
And so that's what mathematics is to me: the development of systems for thinking about the properties and relations among sets of idealized objects. And, you know, the mathematical notation system that we unfortunately focus way too much on is just our way of expressing propositions about these properties.
Right.
It's just like we were talking about with Chomsky and language: it's the thing we've invented for the communication of those ideas, it's not necessarily the deep representation of those ideas.
So what's a good way to model such powerful mathematical reasoning, would you say?
What are some ideas you have for capturing this in a model?
The insights that human mathematicians have had are a combination of the intuitive, connectionist-like knowledge that makes it so that something is just obviously true, so that you don't have to think about why it's true, which then makes it possible to take the next step and ponder and reason and figure out something that you previously didn't have that intuition about.
That then ultimately becomes a part of the intuition that the next generation of mathematical thinkers have to ground their own thinking on, so that they can extend the ideas even further.
I came across this quotation from Henri Poincaré while I was walking in the woods with my wife
in a state park in Northern California late last summer.
And what it said on the bench was it is by logic that we prove but by intuition that
we discover.
And so, for me, the essence of the project is to understand how to bring the intuitive connectionist resources to bear on letting the intuitive discovery arise from engagement in thinking with this formal system.
So I think of the ability of somebody like Hinton or Newton or Einstein or Rumelhart or Poincaré or Archimedes, that's another example, to suddenly have a flash of insight occur.
It's like the constellation of all of these simultaneous constraints that somehow or other
causes the mind to settle into a novel state that it never did before and give rise to
a new idea that then you could say, okay, well now how can I prove this?
How do I write down the steps of that theorem that allow me to make it rigorous and certain?
And so I feel like the kinds of things that we're beginning to see deep learning systems
do of their own accord kind of gives me this feeling of, I don't know, hope or encouragement
that ultimately it'll all happen.
So in particular as many people now have become really interested in thinking about, neural
networks that have been trained with massive amounts of text can be given a prompt and
they can then generate some really interesting, fanciful, creative story from that prompt.
And there's kind of like a sense that they've somehow synthesized something like novel
out of all of the particulars of all of the billions and billions of experiences that
went into the training data that gives rise to something like this sort of intuitive sense
of what would be a fun and interesting little story to tell or something like that.
It just sort of wells up out of letting the thing play out its own imagining of what
somebody might say given this prompt as an input to get it to start to generate its own
thoughts.
And to me that sort of represents the potential of capturing the intuitive side of this.
And there's other examples.
I don't know if you will find them as captivating, but on the DeepMind side with AlphaZero, if you study chess, the kind of solutions it has come up with in chess, there's novel ideas there.
It feels very like there's brilliant moments of insight.
And the mechanism they use: if you think of search as maybe more towards good old-fashioned AI, then there's the connectionist network that has the intuition of looking at a board, looking at a set of patterns and saying how good is this set of positions, and the next few positions, how good are those?
And that's it.
That's just an intuition.
Grandmasters have this, an understanding, positionally and tactically, of how good the situation is and how it can be improved without doing this full deep search.
And then maybe doing a little bit of what human chess players call calculation, which
is the search.
They're taking a particular set of steps down the line to see how they unroll.
But there is moments of genius in those systems too, so that's another hopeful illustration
that from neural networks can emerge this novel creation of an idea.
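As a schematic of that combination, with made-up names and a stubbed evaluator rather than DeepMind's actual AlphaZero (which uses Monte Carlo tree search guided by trained policy and value networks): a stand-in "value net" supplies the intuition at the leaves, and a shallow minimax "calculation" looks a few moves ahead.

```python
def value_net(position):
    """Stand-in for the learned intuition: score a position with no lookahead."""
    return -abs(position)   # toy scoring of a toy one-number "game"

def legal_moves(position):
    return [-1, 0, 1]       # toy move set

def apply_move(position, move):
    return position + move

def calculate(position, depth, maximizing=True):
    """A little 'calculation': shallow minimax that falls back on intuition at the leaves."""
    if depth == 0:
        return value_net(position)
    values = [calculate(apply_move(position, m), depth - 1, not maximizing)
              for m in legal_moves(position)]
    return max(values) if maximizing else min(values)

position = 3
best = max(legal_moves(position),
           key=lambda m: calculate(apply_move(position, m), depth=2, maximizing=False))
print("chosen move:", best)  # the move whose short calculation looks best to the intuition
```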
Yes.
And I think that, you know, Demis Hassabis has spoken about those things.
I heard him describe a move that was made in one of the Go matches against Lee Sedol in a very similar way, and it caused me to become really excited to kind of collaborate with some of those guys at DeepMind.
So I think, though, that one part of what I like to emphasize about mathematical cognition, at least, is that philosophers and logicians going back 3,000 or even a little more than 3,000 years began to develop these formal systems, and gradually the whole idea of thinking formally got constructed.
And you know, it's preceded Euclid, certainly present in the work of Thales and others and
I'm not the world's leading expert in all the details of that history.
But Euclid's Elements was kind of the touchpoint, a coherent document that sort of laid out this idea of an actual formal system within which these objects were characterized, and the system of inference that allowed new truths to be derived from others was sort of established as a paradigm.
And what I find interesting is the idea that the ability to become a person who is capable of thinking in this abstract, formal way is a result of the same kind of immersion in the experience of thinking in that way that we now begin to think of our understanding of language as being.
We immerse ourselves in a particular language, in a particular world of objects and their
relationships and we learn to talk about that and we develop intuitive understanding of
the real world in a similar way.
We can think that what academia has created for us, what, you know, those early philosophers and their academies in Athens and Alexandria and other places allowed, was the development of these schools of thought, modes of thought that then become deeply ingrained. And it becomes what it is that makes it so that somebody like Jerry Fodor would think that systematic thought is the essential characteristic of the human mind, as opposed to a derived and acquired characteristic that results from acculturation in a certain mode that's been invented by humans.
Would you say it's more fundamental than like language?
If we start dancing, if we bring Chomsky back into the conversation, first of all, is it
unfair to draw a line between mathematical cognition and linguistic cognition?
I think that's a very interesting question and I think it's one of the ones that I'm
actually very interested in right now.
I think the answer is in important ways, it is important to draw that line, but then to
come back and look at it again and see some of the subtleties and interesting aspects
of the difference.
If we think about Chomsky himself, he was born into an academic family.
His father was a professor of rabbinical studies at a small rabbinical college in Philadelphia
and he was deeply inculturated in a culture of thought and reason and brought to the effort
to understand natural language this profound engagement with these formal systems.
I think that there was tremendous power in that and that Chomsky had some amazing insights into the structure of natural language. But, and I'm going to use the word "but" there, the ordinary person's intuitive knowledge of these things only goes so far and does not go as far as it does in people like Chomsky himself.
This was something that was discovered in the PhD dissertation of Lila Gleitman who
was actually trained in the same linguistics department with Chomsky.
What Lila discovered was that the intuitions that linguists had about even the meaning
of a phrase, not just about its grammar but about what they thought a phrase must mean,
were very different from the intuitions of an ordinary person who wasn't a formally trained
thinker.
It recently has become much more salient.
I happen to have learned about this when I myself was a PhD student at the University
of Pennsylvania, but I never knew how to put it together with all of my other thinking
about these things.
I actually currently have the hypothesis that formally trained linguists and other formally
trained academics, whether it be linguistics, philosophy, cognitive science, computer science,
machine learning, mathematics, have a mode of engagement with experience that is intuitively
deeply structured to be more organized around the systematicity and ability to be conformant
with the principles of a system than is actually true of the natural human mind without that
immersion.
That's fascinating, so the different fields and approaches with which you start to study
the mind actually take you away from the natural operation of the mind, so it makes it very difficult for you to be somebody who introspects.
Yes.
This is where things about human belief and so-called knowledge that we consider private,
not our business to manipulate in others, we are not entitled to tell somebody else
what to believe about certain kinds of things.
What are those beliefs?
Well, they are the product of this immersion and enculturation.
That is what I believe.
That's limiting.
It's something to be aware of.
Does that limit you from having a good model of cognition?
It can.
So when you look at mathematical versus linguistic cognition, what is that line then?
So is Chomsky unable to sneak up to the full picture of cognition?
Are you, when you're focusing on mathematical thinking, are you also unable to do so?
I think you're right.
I think that's a great way of characterizing it and I also think that it's related to
the concept of beginner's mind and another concept called the expert blind spot.
So the expert blind spot is much more prosaic seeming than this point that you were just
making, but it's something that plagues experts when they try to communicate their understanding
to non-experts and that is that things are self-evident to them that they can't begin
to even think about how they could explain it to somebody else because it's like, well,
it's just so patently obvious that it must be true.
When Kronecker said, God made the natural numbers, all else is the work of man, he was
expressing that intuition that somehow or other, the basic fundamentals of discrete quantities being countable and enumerable and indefinite in number was not something that had to be discovered, but he was wrong.
It turns out that many cognitive scientists agreed with him for a time. There was a long period of time where the natural numbers were considered to be a part of the innate endowment, of core knowledge, to use the phrase that Spelke and Carey use to talk about what they believe are the innate primitives of the human mind, and they no longer believe that.
It's actually been more or less accepted by almost everyone that the natural numbers are a cultural construction, and it's so interesting to go back and study those few people who still exist who don't have those systems.
This is just an example to me of how a certain mode of thinking about language itself, or a certain mode of thinking about geometry and those kinds of relations, becomes so second nature that you don't know what it is that you need to teach.
In fact, we don't really teach it all that explicitly anyway.
You take a math class, the professor teaches it to you the way they understand it.
Some of the students in the class, they get it, they start to get the way of thinking, and they can actually do the problems that get put on the homework that the professor thinks are interesting and challenging ones, but most of the students, who don't engage as deeply, don't ever get it, and we think, oh, that man must be brilliant.
He must have this special insight, some biological sort of bit that's different that makes it so that he or she could have that insight.
Although I don't want to dismiss biological individual differences completely, I find
it much more interesting to think about the possibility that it was that difference in
the dinner table conversation at the Chomsky house when he was growing up that made it
so that he had that cast of mind.
Yeah, and there's a few topics we talked about that kind of interconnect, because I wonder, the better we get at certain things, we humans, the deeper we understand something, what are we starting to then miss about the rest of the world?
We talked about David and his degenerative mind, and when you look in the mirror and wonder how different am I cognitively from the man I was a month ago, from the man I was a year ago.
If I had, like Chomsky, thought about language for 10 or 20 years, what am I no longer able to see?
What is in my blind spot and how big is it?
And then to somehow be able to leap back out of your deep structure that you form for yourself
about thinking about the world, leap back and look at the big picture again or jump out
of your current way of thinking.
And to be able to introspect, what are the limitations of your mind?
How is your mind less powerful than you used to be or more powerful or different, powerful
in different ways?
That seems to be a difficult thing to do because we're looking at the world through the lens
of our mind, to step outside and introspect is difficult, but it seems necessary if you
want to make progress.
One of the threads of psychological research that's always been very important to me to
be aware of is the idea that our explanations of our own behavior aren't necessarily
part of the causal process that caused that behavior to occur or even valid observations
of the set of constraints that led to the outcome, but they are post hoc rationalizations
that we can give based on information at our disposal about what might have contributed
to the result that we came to when asked.
So this is an idea that was introduced in a very important paper by Nisbett and Wilson
about the limits on our ability to be aware of the factors that cause us to make the choices
that we make, and I think it's something that we really ought to be much more cognizant of in general as human beings: our own insight into exactly why we hold the beliefs that we do, hold the attitudes, make the choices, and feel the feelings that we do is not something that we totally control or totally observe. It's subject to our culturally transmitted understanding of what it is that is the mode we use to explain these things when asked to do so, as much as it is about anything else. And so even our ability to introspect and think we have access to our own thoughts is a product of culture and belief and practice.
So let me ask you the big question of advice.
So you've lived an incredible life in terms of the ideas you've put out into the world
in terms of the trajectory you've taken through your career through your life.
What advice would you give to young people today in high school and college about how
to have a career or how to have a life they can be proud of?
Finding the thing that you are intrinsically motivated to engage with and then celebrating
that discovery is what it's all about.
When I was in college, I struggled with that.
I had thought I wanted to be a psychiatrist because I think I was interested in human
psychology in high school and at that time the only sort of information I had that had
anything to do with the psyche was Freud and Erich Fromm and sort of popular psychiatry
kinds of things.
And so, well, they were psychiatrists, right?
So I had to be a psychiatrist and that meant I had to go to medical school, and I got to college and I found myself taking the first semester of a three-quarter physics class, and it was mechanics, and this was so far from what it was I was interested in, but it was also too early in the morning in the winter quarter, so I never made it to the physics class.
But I wandered about the rest of my freshman year and most of my sophomore year until I
found myself in the midst of this situation where around me there was this big revolution
happening.
I was at Columbia University in 1968 and the Vietnam War is going on.
Columbia is building a gym in Morningside Heights which is part of Harlem and people
are thinking, oh, the big bad rich guys are stealing the parkland that belongs to the people
of Harlem and they're part of the military industrial complex which is enslaving us
and sending us all off to war in Vietnam.
And so there was a big revolution that involved a confluence of black activism and SDS and
social justice and the whole university blew up and got shut down and I got a chance to
sort of think about why people were behaving the way they were in this context.
And I happened to have taken mathematical statistics, I happened to have been taking
psychology that quarter at just psych one and somehow things in that space all ran together
in my mind and got me really excited about asking questions about why people, what made
certain people go into the buildings and not others and things like that.
And so suddenly I had a path forward and I had just been wandering around aimlessly.
And at the different points in my career when I think, okay, well, should I take this class
or should I just read that book about some idea that I want to understand better?
Or should I pursue the thing that excites me and interests me or should I meet some requirement?
I always did the latter.
So I ended up, my professors in psychology thought I was great, they wanted me to go
to graduate school.
They nominated me for Phi Beta Kappa and I went to the Phi Beta Kappa ceremony and this guy came up and said, oh, are you magna or summa?
I wasn't even getting honors based on my grades; they just happened to have thought I was interested enough in ideas to belong to Phi Beta Kappa.
So I mean, would it be fair to say you kind of stumbled around a little bit, through accidents of too-early-morning physics classes and so on, until you discovered intrinsic motivation, as you mentioned?
And then that's it.
It hooked you and then you celebrate the fact that this happens to human beings.
And what is it that made what I did intrinsically motivating to me?
Well, that's interesting, and I don't know all the answers to it, and I don't think anybody should be sort of, in any way, I don't know, sanctimonious or anything about it.
It's like I really enjoyed doing statistical analysis of data.
I really enjoyed running my own experiment, which was what I got a chance to do in the psychology department. In chemistry and physics, I never imagined that mere mortals would ever do an experiment in those sciences, except one that was in the textbook that you were told to do in lab class.
But in psychology, even when I was taking Psych 1, it turned out we had our own rat, and after two set experiments, we got to do something we thought of ourselves with our rat.
So it's the opportunity to do it myself and to bring together a certain set of things
that engaged me intrinsically.
And I think it has something to do with why certain people turn out to be profoundly amazing
musical geniuses.
They get immersed in it at an early enough point and it just sort of gets into the fabric.
So my little brother had intrinsic motivation for music as we witnessed when he discovered
how to put records on the phonograph when he was like 13 months old and recognized which one he wanted to play, not because he could read the labels, but because he could sort
of see which ones had which scratches, which ones were the different pieces, oh, that's Rapsodie espagnole and that's-
Oh, wow.
And he enjoyed that, that connected with him somehow.
Yeah, and there was something that it fed into and you're extremely lucky if you have
that and if you can nurture it and can let it grow and let it be an important part of
your life.
So those are the two things: be attentive enough to feel it when it comes.
This is something special.
I mean, I don't know, for example, I really like tabular data, like Excel sheets, it brings
me deep joy.
I don't know how useful that is for anything.
That's part of what I'm talking about, absolutely.
So there's a million, not a million, but there's a lot of things like that for me and you have
to hear that for yourself, realize this is really joyful.
But then the other part that you're mentioning, which is the nurture, is take time and stay
with it, stay with it awhile and see where that takes you in life.
Yeah, and I think the motivational engagement results in the immersion that then creates
the opportunity to obtain the expertise.
So we could call it the Mozart effect.
I mean, when I think about Mozart, I think about the person who was born as the fourth
member of the family's string quartet and they handed him the violin when he was six weeks
old.
All right, start playing.
The level of immersion there was amazingly profound, but hopefully he also had something,
maybe this is where the more sort of the genetic part comes in sometimes, I think something
in him resonated to the music so that the synergy of the combination of that was so
powerful.
So that's what I really consider to be the Mozart effect.
It's sort of the synergy of something with experience that then results in the unique
flowering of a particular mind.
So I know my siblings and I are all very different from each other.
We've all gone in our own different directions and I mentioned my younger brother who was
very musical.
My other younger brother was this amazing, intuitive engineer, and one of my sisters was passionate about water conservation well before it became the hugely important issue that it is today.
So we all sort of somehow find a different thing and I don't mean to say it isn't tied
in with something about us biologically, but it's also when that happens where you can
find that then you can do your thing and you can be excited about it.
So people can be excited about fitting people on bicycles as well as excited about making
neural networks achieve insights into human cognition.
Yeah, like for me personally, I've always been excited about love and friendship between humans, just the actual experience of it as a child, observing people around me, and I've also been excited about robots.
And there's something in me that thinks I really would love to explore how those two
things combine and it doesn't make any sense.
A lot of it is also timing just to think of your own career and your own life.
You found yourself in certain places that happened to involve some of the greatest
thinkers of our time.
And so it just worked out that you guys developed those ideas, and there may be a lot of other people similar to you who were brilliant and never found that right connection and place where the ideas could flourish.
So it's timing, it's place, it's people.
And ultimately the whole ride, you know, it's undirected.
Can I ask you about something you mentioned in terms of psychiatry when you were younger?
Because I had a similar experience of reading Freud and Carl Jung and just, you know, those
kind of popular psychiatry ideas.
And that was a dream for me early on in high school, a hope to understand the human mind; somehow psychiatry felt like the right discipline for that.
Does that make you sad that psychiatry is not the mechanism by which you are able to
explore the human mind?
So for me, I was a little bit disillusioned because of how much prescription medication
and biochemistry is involved in the discipline of psychiatry, as opposed to the Freudian dream of using the mechanisms of language to explore the human mind.
So that was a little disappointing.
And that's why I kind of went to computer science and thinking like maybe you can explore
the human mind by trying to build the thing.
Yes, I wasn't exposed to the sort of the biomedical slash pharmacological aspects of psychiatry
at that point because I dropped out of that whole idea of premed, so I never even found out about that until much later.
But you're absolutely right.
So I was actually a member of the National Advisory Mental Health Council, that is to
say the board of scientists who advised the director of the National Institute of Mental
Health.
And that was around the year 2000.
And in fact, at that time, the man who came in as the new director, I had been on this
board for a year when he came in said, okay, schizophrenia is a biological illness.
It's a lot like cancer.
We've made huge strides in curing cancer.
And that's what we're going to do with schizophrenia.
We're going to find the medications that are going to cure this disease.
And we're not going to listen to anybody's grandmother anymore.
And good old behavioral psychology is not something we're going to support any further.
And he completely alienated me from the Institute and from all of its prior policies, which
had been much more holistic, I think, really at some level.
And the other people on the board were like psychiatrists, very biological psychiatrists.
It didn't pan out, right?
That nothing has changed in our ability to help people with mental illness.
And so 20 years later, that particular path was a dead end, as far as I can tell.
Well, there's some aspect to it, and I've started to romanticize the whole philosophical conversation about the human mind.
But to me, psychiatrists for a time held the flag of being the deep thinkers; in the same way that physicists are the deep thinkers about the nature of reality, psychiatrists are the deep thinkers about the nature of the human mind.
And I think that flag has been taken from them and carried by people like you.
It's more in the cognitive psychology, especially when you have a foot in the computational
view of the world.
Because you can both build it, you can like intuit about the functioning of the mind by
building little models and be able to see mathematical things.
And then deploying those models, especially in computers, to say, does this actually work?
They do little experiments.
And then some combination of neuroscience, where you're starting to actually be able
to observe, do certain experiments on human beings and observe how the brain is actually
functioning.
And there, using intuition, you can start being the philosopher, like Richard Feynman
is the philosopher, a cognitive psychologist can become the philosopher.
And psychiatrists become much more like doctors, they're like very medical.
They help people with medication, biochemistry and so on.
But they are no longer the book writers and the philosophers, which of course I admire.
I admire the Richard Feynman ability to do great low-level mathematics and physics and
the high-level philosophy.
Yeah.
I think it was Fromm and Jung more than Freud that sort of initially kind of made me feel like, oh, this is really amazing and interesting.
And I want to explore it further.
I actually, when I got to college and I lost that thread, I found more of it in sociology
and literature than I did in any place else.
So I took quite a lot of both of those disciplines as an undergraduate.
And I was actually deeply ambivalent about the psychology because I was doing experiments
after the initial flurry of interest in why people would occupy buildings during an insurrection
and be sort of so over-committed to their beliefs.
But I ended up in the psychology laboratory running experiments on pigeons.
And so I had this profound sort of dissonance between the kinds of issues that would be
explored when I was thinking about what I read about in modern British literature versus
what I could study with my pigeons in the laboratory.
That got resolved when I went to graduate school and I discovered cognitive psychology.
And so for me, that was the path out of this sort of like extremely sort of ambivalent
divergence between the interest in the human condition and the desire to do actual mechanistically
oriented thinking about it.
And I think we've come a long way in that regard and that you're absolutely right that
nowadays this is something that's accessible to people through the pathway in through computer
science or the pathway in through neuroscience.
You can get derailed in neuroscience down to the bottom of the system where you might
find the cures of various conditions but you don't get a chance to think about the higher
level stuff.
It's in the systems and cognitive neuroscience and computational intelligence miasma up there
at the top that I think these opportunities are richest right now.
And so yes, I am indeed blessed by having had the opportunity to fall into that space.
So you mentioned the human condition, speaking of which you happen to be a human being who's
unfortunately not immortal.
That seems to be a fundamental part of the human condition, that this ride ends.
Do you think about the fact that you're going to die one day?
Are you afraid of death?
I would say that I am not as much afraid of death as I am of degeneration.
I say that in part for reasons of having seen some tragic degenerative situations unfold.
It's exciting when you can continue to participate and feel like you're near the place where
the wave is breaking on the shore if you like.
And I think about my own future potential.
If I were to begin to suffer from dementia, Alzheimer's disease or semantic dementia
or some other condition, I would gradually lose the thread of that ability.
So one can live on for several, for a decade after sort of having to retire because one
no longer has these kinds of abilities to engage.
And I think that's the thing that I fear the most.
The losing of that, the breaking of the wave, the flourishing of the mind where you could
have these ideas and they're swimming around, you're able to play with them and collaborate
with other people who are themselves really helping to push these ideas forward.
What about the edge of the cliff, the end, the mystery of it?
My graded conception of mind and a continuous way of thinking about most things makes it
so that to me, the discreteness of that transition is less apparent than it seems to be to most
people.
I see.
I see.
Yeah.
I wonder if you know the work of Ernest Becker and so on, I wonder what role mortality and
our ability to be cognizant of it and anticipate it and perhaps be afraid of it, what role
that plays in our reasoning of the world.
I think that it can be motivating to people to think they have a limited period left.
I think in my own case, it's like seven or eight years ago now that I was sitting around
doing experiments on decision making that were satisfying in a certain way because I
could really get closure on whether the model fit the data perfectly or not.
I could see how one could test the predictions in monkeys as well as humans and really see
what the neurons were doing, but I just realized, hey, wait a minute.
I may only have about 10 or 15 years left here.
I don't feel like I'm getting towards the answers to the really interesting questions
while I'm doing this particular level of work.
That's when I said to myself, okay, let's pick something that's hard.
That's when I started working on mathematical cognition.
I think it was more in terms of, well, I got 15 more years possibly of useful life left.
Let's imagine that it's only 10.
I'm actually getting close to the end of that now, maybe three or four more years, but I'm
beginning to feel like, well, I probably have another five after that, so okay, I'll give
myself another six or eight.
But a deadline is looming.
It's not going to go on forever, so yeah, I got to keep thinking about the questions
that I think are the interesting and important ones for sure.
What do you hope your legacy is?
You've done some incredible work in your life as a man, as a scientist. When human civilization is long gone and the aliens are reading the encyclopedia about the human species.
What do you hope is the paragraph written about you?
I would want it to highlight a couple of things: that I was able to see a path that was more exciting to me than the one that seemed already to be there for a cognitive psychologist, not for any super special reason other than that I'd had the right context prior to that, but that I had gone ahead and followed that lead.
Then I forget the exact wording, but I said in this preface that the joy of science is
the moment in which a partially formed thought in the mind of one person gets crystallized
a little better in the discourse and becomes the foundation of some exciting concrete piece
of actual scientific progress.
I feel like that moment happened when Rumelhart and I were doing the interactive activation model and when Rumelhart heard Hinton talk about gradient descent and having the objective
function to guide the learning process.
It happened a lot in that period and I seek that thing in my collaborations with my students.
The idea that this is a person who contributed to science by finding exciting collaborative
opportunities to engage with other people is something that I certainly hope
is part of the paragraph.
Like you said, taking a step maybe in directions that are non-obvious, the old Robert Frost road less taken, and maybe, because like you said it starts as an incomplete initial idea, that step you take is a little bit off the beaten path.
If I could just say one more thing here.
This was something that really contributed to energizing me in a way that I feel it would
be useful to share.
My PhD dissertation project was a completely empirical, experimental project and I wrote
a paper based on the two main experiments that were the core of my dissertation and I
submitted it to a journal.
At the end of the paper, I had a little section where I laid out the beginnings of my theory
about what I thought was going on that would explain the data that I had collected and
I had submitted the paper to the Journal of Experimental Psychology, so I got back a letter
from the editor saying, thank you very much, these are great experiments and we'd love
to publish them in the journal, but what we'd like you to do is to leave the theorizing
to the theorists and take that part out of the paper.
So I did, I took that part out of the paper, but I almost found myself labeled as a non-theorist
right by this and I could have succumbed to that and said, okay, well, I guess my job
is to just go on and do experiments, but that's not what I wanted to do and so when
I got to my assistant professorship, although I continued to do experiments because I knew
I had to get some papers out, I also at the end of my first year submitted my first article
to Psychological Review, which was the theoretical journal where I took that section and elaborated
it and wrote it up and submitted it to them, and they didn't accept that either, but this time they said, oh, this is interesting, you should keep thinking about it, and then that was what got me going to think, okay, so it's not a superhuman thing to contribute
to the development of theory, you don't have to be, you can do it as a mere mortal.
And the broader lesson, I think, is don't succumb to the labels of a particular reviewer.
Or anybody labeling you, right?
Yeah, exactly.
I mean, yeah, exactly, and especially as you become successful, your labels get assigned
to you because you're successful for that thing.
Yeah, I'm a connectionist or a cognitive scientist and not a neuroscientist, whatever.
Those are just the stories of the past; you're today a new person that can completely revolutionize totally new areas, so don't let those labels hold
you back.
Let me ask the big question.
When you look into it, you said it started at Columbia, trying to observe these humans, and they're doing weird stuff and you want to know why they're doing this stuff.
So zoom out even bigger.
Look at the 100-plus billion people who've ever lived on Earth: why do you think we're all
doing what we're doing?
What do you think is the meaning of it all, the big why question?
We seem to be very busy doing a bunch of stuff and we seem to be kind of directed towards
somewhere, but why?
Well, I myself think that we make meaning for ourselves and that we find inspiration
in the meaning that other people have made in the past, you know, and the great religious
thinkers of the first millennium BC and, you know, a few that came in the early part of the second millennium laid down some important foundations for us.
But I do believe that, you know, we are an emergent result of a process that happened
naturally without guidance and that meaning is what we make of it and that the creation
of efforts to reify meaning in like religious traditions and so on is just a part of the
expression of that goal that we have to, you know, not find out what the meaning is, but
to make it ourselves.
And so, to me, it's something that's very personal, it's very individual, it's like
meaning will come for you through the particular combination of synergistic elements that are
your fabric and your experience and your context and, you know, you should...
It's all made in a certain kind of a local context, though, right?
So, here I am at UCSD with this brilliant man, Rumelhart, who's having, you know, these
doubts about symbolic artificial intelligence that resonate with my desire to see it grounded
in the biology and let's make the most of that, you know?
Yeah, and so, from that little pocket, there's some kind of peculiar little emergent process, which is basically each one of us. Each one of us humans is kind of, you know, you think of cells and they come together, and it's an emergent process that then tells fancy stories about itself and then, just like you said, just enjoys the beauty of the stories we tell about ourselves.
It's an emergent process that lives for time, is defined by its local pocket and context
in time and space and then tells pretty stories and we write those stories down and then we
celebrate how nice the stories are and then it continues because we build stories on top
of each other and eventually we'll colonize, hopefully, other planets, other solar systems,
other galaxies and we'll tell even better stories.
It all starts here on Earth. Jay, speaking of peculiar emergent processes, you've lived one heck of a story.
You're one of the great scientists of cognitive science, of psychology, of computation.
It's a huge honor that you would talk to me today and spend your very valuable time.
I really enjoy talking with you and thank you for all the work you've done.
I can't wait to see what you do next.
Well, thank you so much and this has been an amazing opportunity for me to let ideas
that I've never fully expressed before come out because you asked such a wide range of
the deeper questions that we've all been thinking about for so long.
So thank you very much for that.
Thank you.
Thanks for listening to this conversation with Jay McClelland.
To support this podcast, please check out our sponsors in the description.
And now, let me leave you with some words from Jeffrey Hinton.
In the long run, curiosity-driven research works best.
Real breakthroughs come from people focusing on what they're excited about.
Thanks for listening and hope to see you next time.