Judea Pearl: Causal Reasoning, Counterfactuals, and the Path to AGI

Lex Fridman Podcast

Conversations about science, technology, history, philosophy and the nature of intelligence, consciousness, love, and power. Lex is an AI researcher at MIT and beyond.

The following is a conversation with Judea Pearl,
a professor at UCLA and a winner of the Turing Award
that's generally recognized as the Nobel Prize of Computing.
He's one of the seminal figures in the field of artificial intelligence,
computer science, and statistics. He has developed and championed
probabilistic approaches to AI, including Bayesian networks,
and profound ideas on causality in general. These ideas are important not
just to AI, but to our understanding and practice of science.
But in the field of AI, the idea of causality, cause and effect,
to many, lies at the core of what is currently missing
and what must be developed in order to build truly intelligent systems.
For this reason, and many others, his work is worth returning to
often. I recommend his most recent book, called The Book of Why, that presents key
ideas from a lifetime of work in a way that is accessible to the general public.
This is the Artificial Intelligence Podcast. If you enjoy it,
subscribe on YouTube, give it 5 stars on Apple Podcasts,
support on Patreon, or simply connect with me on Twitter.
at Lex Fridman, spelled F-R-I-D-M-A-N. If you leave a review on Apple Podcasts
especially, but also Castbox or comment on YouTube, consider mentioning
topics, people, ideas, questions, quotes in science, tech, and philosophy
that you find interesting, and I'll read them on this podcast.
I won't call out names, but I love comments with kindness and thoughtfulness
in them, so I thought I'd share them with you. Someone on YouTube
highlighted a quote from the conversation with Noam Chomsky,
where he said that the significance of your life is something you create.
I like this line as well. On most days, the existentialist approach to life
is one I find liberating and fulfilling.
I recently started doing ads at the end of the introduction. I'll do one or two
minutes after introducing the episode and never any ads in the middle
that break the flow of the conversation. I hope that works for you
and doesn't hurt the listening experience. This show is presented by Cash App,
the number one finance app in the App Store. I personally use Cash App to
send money to friends, but you can also use it to buy, sell, and deposit
Bitcoin in just seconds. Cash App also has a new investing
feature. You can buy fractions of a stock, say $1 worth,
no matter what the stock price is. Brokerage services are provided by Cash App
Investing, a subsidiary of Square, and member SIPC.
I'm excited to be working with Cash App to support one of my favorite
organizations called FIRST, best known for their FIRST Robotics and
LEGO competitions. They educate and inspire hundreds of
thousands of students in over 110 countries
and have a perfect rating on Charity Navigator, which means the donated money
is used to the maximum effectiveness. When you get Cash App from the App Store
or Google Play and use code LexPodcast, you'll get $10 and
Cash App will also donate $10 to FIRST, which again is an organization that
I've personally seen inspire girls and boys to dream of engineering a better
world. And now, here's my conversation with Judea
Pearl. You mentioned in an interview that science
is not a collection of facts but constant human struggle
with the mysteries of nature. What was the first mystery that you can recall that
hooked you, that kept you curious? Oh, the first mystery, that's a good one.
Yeah, I remember that. I had a fever for three days.
When I learned about Descartes, analytic geometry,
and I found out that you can do all the construction in geometry using algebra,
and I couldn't get over it. I simply couldn't get out of bed.
What kind of world does analytic geometry unlock?
Well, it connects algebra with the geometry.
Okay, so Descartes had the idea that geometrical constructions and
geometrical theorems and assumptions can be articulated in the language of
algebra, which means that all the proofs that we did in
high school, trying to prove that the three bisectors
meet at one point and all that, okay, all these can be proven
by just shuffling around notation. Yeah, that was a
traumatic experience. For me it was, I'm telling you.
So it's the connection between the different mathematical disciplines
that they all... No, in between two different languages.
Languages. Yeah. So which
mathematical discipline is most beautiful? Is geometry it for you? Both are
beautiful. They have almost the same power.
But there's a visual element to geometry, being visual. It's more
transparent. But once you get over to algebra, then the
linear equation is a straight line. This translation is easily absorbed.
And to pass a tangent to a circle, you know, you have the basic theorems
and you can do it with algebra. But the transition from one to another
was really, I thought that Descartes was the greatest mathematician of all time.
So you have been at the, if you think of engineering
and mathematics as a spectrum. Yes. You have been,
you have walked casually along this spectrum throughout your life.
You know, a little bit of engineering and then, you know,
you've done a little bit of mathematics here and there. Not a little bit. I mean,
we got a very solid background in mathematics because our teachers were
geniuses. Our teachers came from Germany in the 1930s,
running away from Hitler. They left their careers in Heidelberg
and Berlin and came to teach high school in Israel.
And we were the beneficiary of that experiment.
So I, and they taught us math the good way. What's a good way to teach math?
Chronologically. The people. The people behind the
theorems, yeah. Their cousins and their nieces and their faces.
And how they jumped from the bathtub when they screamed Eureka
and ran naked in town. So you're almost educated as a historian of
math? No, we just got a glimpse of that history
together with the theorems. So every exercise in math was connected
with the person and the time of the person.
The period. The period also mathematically.
Mathematically speaking, yes. Not the politics.
So, and then in university, you have, you've gone on to do
engineering. Yeah. I got my BS in engineering
at the Technion. And then I moved here for
graduate work and I did engineering in addition to physics
at Rutgers. And it combined very nicely with my
thesis, which I did at RCA Laboratories in superconductivity.
And then somehow thought to switch to almost computer science, software,
even not switch, but longed to become, to get into
software engineering a little bit. Yes.
Programming, if you can call that in the 70s. So there's all these
disciplines. Yeah. If you were to pick a favorite,
in terms of engineering and mathematics, which path do you think
has more beauty? Which path has more power? It's hard to choose, no.
I enjoy doing physics. I even have a vortex named after me.
So I have investment in immortality.
So what is a vortex? A vortex is in superconductivity.
In superconductivity, you have permanent current swirling around.
One way or the other, so you can store a one or a zero
for a computer. That's what we worked on in the 1960s at RCA.
And I discovered a few nice phenomena with the vortices.
They pushed current and they moved. Pearl vortex.
Pearl vortex, you can Google it. Right. I didn't know about it, but the
physicists picked up on my thesis, on my PhD thesis, and
it became popular. I mean, thin film superconductors became
important for high temperature superconductors.
So they call it the Pearl vortex without my knowledge.
I discovered it only about 15 years ago. You have footprints in all of the
sciences. So let's talk about the universe a
little bit. Is the universe at the lowest level
deterministic or stochastic in your amateur philosophy view?
Put another way, does God play dice? Well, we know it is stochastic, right?
Today, today we think it is stochastic. Yes, we think because we have the
Heisenberg uncertainty principle and we have some
experiments to confirm that. All we have is experiments to confirm it.
We don't understand why. Why is already...
You wrote a book about why. Yeah, it's a puzzle.
It's a puzzle that you have the dice flipping
machine, or God, and the result of the flipping
propagates with a speed faster than the speed of light.
We can't explain it. But it only governs microscopic phenomena.
So you don't think of quantum mechanics as useful
for understanding the nature of reality? No, diversionary.
So in your thinking, the world might as well be deterministic?
The world is deterministic and as far as the neuron firing is concerned,
it is deterministic to first approximation. What about free will?
Free will is also a nice exercise. Free will is an illusion
that we AI people are going to solve. So what do you think once we solve it, that
solution will look like once we put it in the page?
The solution will look like, first of all, it will look like a machine.
A machine that acts as though it has free will.
It communicates with other machines as though they have
free will and you wouldn't be able to tell the difference between
a machine that does and a machine that doesn't have free will.
So the illusion, it propagates the illusion of free will amongst the other
machines. And faking it is having it.
Okay, that's what the Turing test is all about. Faking intelligence is intelligence
because it's not easy to fake. It's very hard to fake
and you can only fake if you have it.
That's such a beautiful statement. Yeah, you can't fake it if you
don't have it. So let's begin at the beginning
with probability both philosophically and mathematically. What does it mean to
say the probability of something happening is 50 percent?
What is probability? It's a degree of uncertainty
that an agent has about the world. You're still expressing some knowledge in
that statement. Of course. If the probability is 90 percent,
it's an absolutely different kind of knowledge than if it is 10 percent.
But it's still not solid knowledge. It's solid knowledge, but hey,
if you tell me that with 90 percent assurance smoking will give you lung
cancer in five years versus 10 percent, it's a piece of
useful knowledge. So the statistical view of the universe,
why is it useful? So we're swimming in complete uncertainty,
most of everything around us. It allows you to predict things with
a certain probability, and computing those probabilities is very useful.
That's the whole idea of prediction and you need
prediction to be able to survive. If you can't predict the future, then
just crossing the street will be extremely
fearful. And so you've done a lot of work in causation
and so let's think about correlation. I started with probability.
You started with probability. You've invented the
Bayesian networks and so we'll dance back and forth
between these levels of uncertainty. But what is correlation?
So the probability of something happening is something,
but then there's a bunch of things happening and
sometimes they happen together, sometimes not, they're independent or not.
So how do you think about correlation of things? Correlation occurs when two
things vary together over a very long time. It's one way of measuring it.
Or when you have a bunch of variables that all vary
cohesively, then we have a correlation here.
And usually when we think about correlation, we really think
causally. Things cannot be correlated unless there is a reason
for them to vary together. Why should they vary together?
If they don't see each other, why should they vary together?
So underlying it somewhere is causation. Yes.
Hidden in our intuition there is a notion of causation because we cannot grasp
any other logic except causation. And how does conditional probability
differ from causation? So what is conditional probability?
Conditional probability is how things vary when one of them
stays the same. Now staying the same means that I have chosen to look only
at those incidents where the guy has the same value
as the previous one. It's my choice as an experimenter.
So things that are not correlated before could become correlated.
Like for instance if I have two coins which are uncorrelated
and I choose only those flipping experiments in which a bell rings,
and the bell rings when at least one of them is a tail.
Then suddenly I see correlation between the two coins.
Because I only look at the cases where the bell rang.
It's my design, with my ignorance essentially,
with my audacity to ignore certain incidents.
I suddenly create a correlation where it doesn't exist physically.
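
The two-coin example can be checked by exact enumeration. What follows is a minimal sketch, not anything from the conversation itself: the function names are made up for illustration, and the coins are assumed to be fair and independent.

```python
from fractions import Fraction
from itertools import product


def coin_bell_table():
    """List the four equally likely outcomes of two fair, independent coins,
    together with whether the bell rings (at least one tail)."""
    outcomes = []
    for a, b in product("HT", repeat=2):
        bell = (a == "T") or (b == "T")
        outcomes.append((a, b, bell))
    return outcomes


def prob_a_tail(outcomes, b_value=None, bell_only=False):
    """P(coin A = tail), optionally given coin B's value, and optionally
    restricted to the flips in which the bell rang."""
    kept = [o for o in outcomes
            if (not bell_only or o[2]) and (b_value is None or o[1] == b_value)]
    return Fraction(sum(1 for o in kept if o[0] == "T"), len(kept))


outcomes = coin_bell_table()

# Looking at every flip: knowing coin B tells us nothing about coin A.
print(prob_a_tail(outcomes, "H"), prob_a_tail(outcomes, "T"))

# Looking only at flips where the bell rang: coin B now changes what we
# believe about coin A, even though the coins never "see" each other.
print(prob_a_tail(outcomes, "H", bell_only=True),
      prob_a_tail(outcomes, "T", bell_only=True))
```

Over all flips both conditional probabilities are 1/2. Among bell-ringing flips they are 1 and 1/2, so the dependence comes entirely from choosing which flips to look at.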
Right, so you just outlined one of the flaws
of observing the world and trying to infer something from the math
about the world from looking at the correlation.
I don't look at it as a flaw. The world works like that.
But the flaws come if we try to impose
causal logic on correlation. It doesn't work too well.
I mean, but that's exactly what we do. That's what has been the majority
of science. The majority of naive science.
The statisticians know it. The statisticians know that if you
condition on a third variable then you can destroy
or create correlations among two other variables.
They know it. It's in the data. It's nothing surprising.
That's why they all dismiss Simpson's paradox.
Ah, we know it. They don't know anything about it.
Well, there's disciplines like psychology where
all the variables are hard to account for and so
oftentimes there's a leap from correlation to causation.
You're imposing. What do you mean a leap?
Who is trying to get causation from correlation?
You're not proving causation but you're sort of
discussing it, implying sort of hypothesizing without ability to prove it.
Which discipline do you have in mind? I'll tell you if they are
obsolete or if they are outdated or they are about to get outdated or
tell me which one you have. Well, psychology, you know.
What is it? SEM? Structural equation? No, no. I was thinking of applied
psychology studying. For example, we work with human behavior
in semi-autonomous vehicles. How people behave and you have to conduct
these studies of people driving cars. Everything starts with a
question. What is the research question? What is the research question?
The research question, do people fall asleep
when the car is driving itself? Do they fall asleep or do they tend to fall
asleep more frequently? More frequently. Than when the car is not driving?
Than when it's not driving itself. That's a good question, okay?
And so you measure, you put people in the car
because it's real world. You can't conduct an experiment where you control
everything. Why can't you? You could. Turn the automatic
module on and off. Because it's on public roads. I mean,
there are aspects to it that are unethical
because it's testing on public roads. So you can only use vehicle.
They have to, the people, the drivers themselves have to make that choice
themselves. And so they regulate that. And so you just
observe when they drive it autonomously and when they don't.
And then. But maybe they turn it off when they were very tired.
Yeah, that's kind of thing. But you, you don't know those variables.
Okay, so that you have now uncontrolled experiments.
Uncontrolled experiments. We call it observational study.
And we, from the correlation detected, we have to infer a causal
relationship: whether it was the automatic piece
that had caused them to fall asleep. Oh, okay. So that is an
issue that is about 120 years old. Yeah.
Or I should say only 100 years old. Okay.
And, oh, maybe not, actually I should say it's 2000 years old
because we have this experiment by Daniel. The Babylonian king
wanted the exiles, the people from Israel that were taken
in exile to Babylon to serve the king, to be served the king's food,
which was meat. And Daniel, as a good Jew, couldn't
eat non-kosher food. So he asked to eat vegetarian food.
But the king's overseer said, I'm sorry, but if the king sees that your
performance falls below that of the other kids, you know, he's going to kill me.
Daniel said, let's make an experiment. Let's take four of us from Jerusalem.
Okay, give us vegetarian food. Let's take the other guys
to eat the king's food, and in about a week's time
we'll test our performance. And you know the answer?
Of course, he did the experiment. And they were
so much better than the others. And the king nominated them
to a superior position in his kingdom. So it was the first experiment, yes.
So there was a very simple, it's also the same
research question. We want to know if vegetarian food assists or obstructs
your mental ability. And the question is a very old one.
Even Democritus said, if I could discover one cause
of things, I would rather discover one cause than be a king of Persia.
The task of discovering causes was in the mind of ancient people
from many, many years ago. But the mathematics
of doing that was only developed in the 1920s.
So science has left us orphaned. Science has not provided us
with the mathematics to capture the idea of
x causes y and y does not cause x. Because all the questions of physics
are symmetrical, algebraic. The equality sign goes both ways.
Okay, let's look at machine learning. Machine learning today, if you look at
deep neural networks, you can think of it as
kind of conditional probability estimators.
Beautiful. So where did you say that? Conditional probability estimators.
None of the machine learning people clobbered you, attacked you?
Listen, most people, and this is why this today's conversation I think is
interesting, is most people would agree with you.
There are certain aspects that are just effective today, but we're going to hit a
wall and there's a lot of ideas. I think you're very right that we're
going to have to return to about causality.
Let's try to explore it. Let's even take a step back. You've
invented Bayesian networks that look awfully a lot like they
express something like causation, but they don't, not necessarily.
So how do we turn Bayesian networks into
expressing causation? How do we build causal networks?
A causes B, B causes C. How do we start to infer that kind of thing?
We start asking ourselves a question. What are the factors
that would determine the value of X? X could be blood pressure,
death, hunger. But these are hypotheses that we propose.
I protest this. Everything which has to do with causality comes from a theory.
The difference is only what kind, how you interrogate the theory you have in
your mind. So it still needs the human expert to
propose. Right. You need the human expert to specify
the initial model. Initial model could be very
qualitative. Just who listens to whom. By "who listens to whom" I mean
one variable listens to the other. So I say, okay, the tide is listening to the
moon and not to the rooster's crow.
And so forth. This is our understanding of the world in which we live.
Scientific understanding of reality. We have to start there.
Because if we don't know how to handle
cause and effect relationships when we do have a model,
then we certainly do not know how to handle them when we don't have a model.
So let's start first. In AI, the slogan is representation first,
discovery second. But if I give you all the information that you need,
can you do anything useful with it? That is the first representation.
How do you represent it? I give you all the knowledge in the world. How do you
represent it? When you represent it, I ask you,
can you infer x or y or z? Can you answer certain queries?
Is it complex? Is it polynomial? All the computer science exercises
we do once you give me a representation for my knowledge.
Then you can ask me, now I understand how to represent things.
How do I discover them? That's the second thing.
So first of all, I should echo the statement that mathematics and the
current, much of the machine learning world
has not considered causation that A causes B.
Just in anything. That seems like a
non-obvious thing that you think we would have
really acknowledged it but we haven't. So we have to put that on the table.
So knowledge, how hard is it to create knowledge from which
to work? In certain areas, it's easy because we have
only four or five major variables and an epidemiologist or an economist can
put them down. Minimum wage, unemployment,
policy, x, y, z and start collecting data
and quantify the parameters that were left
unquantified with the initial knowledge. That's the
routine work that you find in experimental psychology,
in economics, everywhere, in the health science.
That's a routine thing. But I should emphasize, you should start with a
research question. What do you want to estimate?
Once you have that, you have to have a language of expressing what you want to
estimate. You think it's easy? No. So we can talk
about two things. I think one is how the science of causation
is very useful for answering certain questions
and then the other is how do we create intelligent systems
that need to reason with causation? So if my research question is how do I pick
up this water bottle from the table? All the
knowledge is required to be able to do that.
How do we construct that knowledge base? Do we return back to the problem
that we didn't solve in the 80s with expert systems? Do we have to solve that
problem of automated construction of knowledge?
You're talking about the task of eliciting knowledge from an expert.
Task of eliciting knowledge from an expert or the self-discovery of
more knowledge, more and more knowledge. So automating the building of
knowledge as much as possible. It's a different game in the causal
domain because it's essentially the same thing. You
have to start with some knowledge and you're trying to enrich it.
But you don't enrich it by asking for more rules. You enrich it by asking for
the data, to look at the data and quantifying
and ask queries that you couldn't answer when you started.
You couldn't because the question is quite complex and it's not
within the capability of ordinary cognition, of ordinary
person or ordinary expert even, to answer. So what kind of questions do you
think we can start to answer? Even a simple one.
Suppose I start with an easy one. Let's do it.
What's the effect of a drug on recovery?
Was it the aspirin that caused my headache to be cured? Or was it the
television program? Or the good news I received?
This is already a difficult question because it's
finding cause from effect. The easy one is finding effects from cause.
That's right. So first you construct a model saying that this is an important
research question. This is an important question.
I didn't construct a model yet. I just said it's an important question.
And the first exercise is express it mathematically.
What do you want to do? Like if I tell you what will be the effect
of taking this drug? You have to say that in mathematics.
How do you say that? Can you write down the question?
Not the answer. I want to find the effect of the drug
on my headache. Write down. Write it down. That's where the
do-calculus comes in. Yes. The do operator. What do you do?
The do operator. Yeah. Which is nice. It's the difference between association
and intervention. Very beautifully sort of constructed.
Yeah. So we have a do operator. So the do-calculus, connected to the do
operator itself, connects the operation of doing
to something that we can see. Right. So as opposed to the purely observing
you're making the choice to change a variable. That's what it
expresses. And then the way that we interpret it
the mechanism by which we take your query
and we translate it into something that we can work with
is by giving it semantics. Saying that you have a model of the world
and you cut off all the incoming arrows into X,
and looking now at the modified, mutilated model,
you ask for the probability of Y. That is the interpretation of doing X.
Because by doing things you've liberated them
from all influences that acted upon them
earlier and you subject them to the tyranny of your muscles.
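
The "cut off the incoming arrows" semantics can be sketched on a toy model. Everything below is an illustrative assumption rather than something from the conversation: a three-variable graph Z -> X, Z -> Y, X -> Y, binary variables, and made-up probabilities. The sketch compares seeing X = 1 with doing X = 1.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical numbers for the graph Z -> X, Z -> Y, X -> Y (all binary).
P_Z = {0: F(1, 2), 1: F(1, 2)}
P_X1_GIVEN_Z = {0: F(1, 5), 1: F(4, 5)}            # Z pushes X towards 1
P_Y1_GIVEN_XZ = {(0, 0): F(1, 10), (1, 0): F(3, 10),
                 (0, 1): F(1, 2), (1, 1): F(7, 10)}  # keyed by (x, z)


def bernoulli(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1 - p_one


def joint(do_x=None):
    """Joint distribution over (z, x, y).

    With do_x set, the arrow Z -> X is cut: X no longer listens to Z and is
    fixed to do_x. All the other mechanisms are left untouched.
    """
    table = {}
    for z, x, y in product((0, 1), repeat=3):
        if do_x is None:
            p_x = bernoulli(P_X1_GIVEN_Z[z], x)
        else:
            p_x = F(1) if x == do_x else F(0)
        table[(z, x, y)] = P_Z[z] * p_x * bernoulli(P_Y1_GIVEN_XZ[(x, z)], y)
    return table


def prob_y1_given_x(table, x):
    """P(Y = 1 | X = x) computed from a joint table."""
    num = sum(p for (_, xv, y), p in table.items() if xv == x and y == 1)
    den = sum(p for (_, xv, _), p in table.items() if xv == x)
    return num / den


seeing = prob_y1_given_x(joint(), 1)        # P(Y=1 | X=1) in the original model
doing = prob_y1_given_x(joint(do_x=1), 1)   # P(Y=1 | do(X=1)) in the mutilated model
print(seeing, doing)
```

With these numbers, seeing gives 31/50 and doing gives 1/2. The gap is the part of the association that flows through Z rather than through X itself.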
So you remove all the questions about causality by doing them.
So there's one level of questions. Yeah. Answer questions about what will happen
if you do things. If you do. If you drink the coffee, if you
take the aspirin. Right. So how do we get the,
how do we get the doing data? Now the question is, if we cannot run
experiments, right, then we have to rely on observational studies.
So first, sorry to interrupt, we could run an experiment
where we do something, where we drink the coffee and don't, and
the do operator allows you to sort of be systematic about expressing.
To imagine how the experiment will look like even though we cannot
physically and technologically conduct it. I'll give you an example.
What is the effect of blood pressure on mortality?
I cannot go down into your vein and change your blood pressure.
But I can ask the question. Which means I can even have a model of your body.
I can imagine how the blood pressure
change will affect your mortality. How? I go into the model and I conduct this
surgery on the blood pressure even though
physically I cannot do it. Let me ask the quantum mechanics question.
Does the doing change the observation? Meaning the surgery of changing the
blood pressure is, I mean... No, the surgery is
very delicate. Very delicate. Incisive and delicate.
Which means do x means I'm going to touch only x.
Directly into x. So that means that I change only things which
depend on x, by virtue of x changing. But I don't change things
which do not depend on x. Like I wouldn't
change your sex or your age. I just change your blood pressure.
So in the case of blood pressure it may be difficult or impossible to construct
such an experiment. No physically yes but hypothetically no.
Hypothetically no. If we have a model that is what the model is for.
So you conduct surgeries on a model you take it apart put it back
that's the idea of a model. It's the idea of thinking counterfactually,
imagining and that's the idea of creativity.
So by constructing that model you can start to infer
whether higher blood pressure leads to mortality
that increases or decreases. I construct the model; I can still
not answer it. I have to see if I have enough information in the model that
would allow me to find out the effects of intervention
from a non-interventional study, from an observational, hands-off study.
So what's needed? You need to have assumptions about who
affects whom. If the graph has a certain property,
the answer is yes, you can get it from an observational study.
If the graph is too bushy, the answer is no, you cannot.
Then you need to find either a different kind of observation that you haven't
considered, or run experiments. So basically
that puts a lot of pressure on you to encode wisdom into that graph?
Correct. But you don't have to encode
more than what you know. God forbid. Economists are doing that;
they call it identifying assumptions. They put in assumptions, even if they don't
prevail in the world; they put in assumptions so they can
identify things. But the problem is, yes, beautifully put, but the problem is you
don't know what you don't know. You do know what you don't know, because
if you don't know, you say it's possible. It's possible
that x affects the traffic tomorrow. It's possible. You put down an arrow
which says it's possible. Every arrow in the graph
says it's possible. So there's not a significant cost
to adding arrows? The more arrows you add,
the less likely you are to identify things from
purely observational data. So if the whole world is bushy
and everybody affects everybody else the answer is
you can answer it ahead of time. I cannot
answer my query from observational data. I have to go to experiments.
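
When the graph does have the right property, the effect of an intervention can be computed from hands-off data alone. As a hedged sketch: assume a graph Z -> X, Z -> Y, X -> Y in which Z is observed and is the only common cause of X and Y. The back-door adjustment formula, P(y | do(x)) = sum over z of P(y | x, z) P(z), then applies. The joint distribution below is invented for illustration, and the function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product


def observational_joint():
    """Hypothetical observational distribution P(z, x, y), all binary,
    generated by the assumed graph Z -> X, Z -> Y, X -> Y."""
    p_z1 = F(1, 2)
    p_x1_given_z = {0: F(1, 5), 1: F(4, 5)}
    p_y1_given_xz = {(0, 0): F(1, 10), (1, 0): F(3, 10),
                     (0, 1): F(1, 2), (1, 1): F(7, 10)}  # keyed by (x, z)
    table = {}
    for z, x, y in product((0, 1), repeat=3):
        pz = p_z1 if z else 1 - p_z1
        px = p_x1_given_z[z] if x else 1 - p_x1_given_z[z]
        py = p_y1_given_xz[(x, z)] if y else 1 - p_y1_given_xz[(x, z)]
        table[(z, x, y)] = pz * px * py
    return table


def prob(table, z=None, x=None, y=None):
    """Marginal probability of the cells that match the values given."""
    wanted = (z, x, y)
    return sum(p for cell, p in table.items()
               if all(w is None or w == c for w, c in zip(wanted, cell)))


def naive_contrast(table):
    """P(Y=1 | X=1) - P(Y=1 | X=0): association only, confounded by Z."""
    p1 = prob(table, x=1, y=1) / prob(table, x=1)
    p0 = prob(table, x=0, y=1) / prob(table, x=0)
    return p1 - p0


def backdoor_adjust(table, x):
    """P(Y=1 | do(X=x)) via adjustment for Z, using observational data only."""
    return sum(prob(table, z=z, x=x, y=1) / prob(table, z=z, x=x) * prob(table, z=z)
               for z in (0, 1))


table = observational_joint()
causal_effect = backdoor_adjust(table, 1) - backdoor_adjust(table, 0)
print(naive_contrast(table), causal_effect)
```

With these numbers the naive contrast is 11/25 while the adjusted effect is 1/5. If Z were unobserved, or if further unmeasured arrows pointed into both X and Y, this adjustment would no longer be licensed by the graph.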
So you talk about machine learning is essentially
learning by association or reasoning by association
and this do-calculus is allowing for intervention
like that word but action. So you also talk about counterfactuals
and trying to sort of understand the difference in counterfactuals and
intervention. First, what are counterfactuals and
why are they useful? Why are they especially useful as
opposed to just reasoning about what effect actions have?
Counterfactuals contain what we normally call explanations. Can you
give an example? If I tell you that acting one way affects
something, I didn't explain anything yet. But if I
ask you, was it the aspirin that cured my headache,
I'm asking for an explanation: what cured my headache?
And putting a finger on aspirin
provides an explanation. It was aspirin that was responsible
for your headache going away. If you didn't take the aspirin you
would still have a headache. So by saying if I didn't take
aspirin I would have a headache you're thereby saying
that aspirin is the thing that removes the headache.
Yes, but you have to have another important piece of information.
I took the aspirin and my headache is gone.
It's very important information. Now I'm reasoning backward and I said,
was it the aspirin? Yeah. By considering what would have happened
if everything else is the same but I didn't take aspirin. That's right. So you
know that things took place, you know, Joe killed Schmo,
and Schmo would be alive had Joe not used his gun.
Okay, so that is the counterfactual. It has a conflict here, or
clash, between the observed fact,
that he did shoot, okay, and the hypothetical
predicate, which says had he not shot. You have a clash,
a logical clash; they cannot exist together.
That's the counterfactual, and that is the source of our
explanations, of the idea of responsibility,
regret, and free will. Yeah, so it certainly seems
that's the highest level of reasoning right. Yes and physicists do it all the
time. Who does it all the time? Physicists. Physicists.
In every equation of physics. Let's say you have Hooke's law,
and you put one kilogram on the spring and the spring is
one meter, and you say, had this weight been two
kilograms, the spring would have been twice as long.
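
The spring counterfactual can be written as three steps, often described as abduction, action, and prediction. This is a minimal sketch under a simplifying assumption taken from the example as spoken: the spring's length is treated as directly proportional to the weight, with rest length ignored. The function name is made up for illustration.

```python
def counterfactual_length(observed_weight_kg, observed_length_m, hypothetical_weight_kg):
    """Answer: "had the weight been W', how long would the spring have been?"

    Assumes length = c * weight for a spring-specific constant c.
    """
    # Abduction: use what was actually observed to pin down the hidden
    # property of this particular spring.
    length_per_kg = observed_length_m / observed_weight_kg

    # Action: override the weight, leaving the spring itself unchanged.
    weight_kg = hypothetical_weight_kg

    # Prediction: run the same mechanism forward with the new weight.
    return length_per_kg * weight_kg


# Observed: 1 kg gives 1 m. Counterfactuals: 2 kg and 3 kg.
print(counterfactual_length(1.0, 1.0, 2.0))
print(counterfactual_length(1.0, 1.0, 3.0))
```

The asymmetry lives in the procedure rather than in the equation: the weight is the quantity that gets overridden, and the length is the one that responds.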
It's no problem for physicists to say that
except that the mathematics is only in the form of an equation,
okay, equating the weight, proportionality constant,
and the length of the spring. So you don't have the
asymmetry in the equation of physics although every physicist
thinks counterfactually. Ask high school kids,
had the weight been three kilograms, what would be the length of the spring?
They can answer it immediately, because they do the counterfactual
processing in their mind and then they put it into
an algebraic equation and they solve it, okay? But the robot cannot do
that. How do you make a robot learn these
relationships? Why would you learn? Suppose you
tell him; can he do it? So before you go learning,
yeah, you have to ask yourself: suppose I give him all the information,
can the robot perform the task that I ask him to perform?
Can he reason and say, no, it wasn't the aspirin,
it was the good news you received on the phone?
Right, because, well, unless the robot had a model, a causal model of the world.
Right, right. I'm sorry I have to linger on this. But now we have to linger and we
have to say, how do we do it? How do we build... Yes. How do we build a
causal model without a team of human experts
running around? Why don't you go to learning right away?
You're too much involved with learning. Because I like babies. Babies learn fast,
and how they do it... Good, yeah, that's another question.
How do the babies come out with the counterfactual model of the world?
And babies do that. Yeah. They know how to play
in the crib. They know which ball hits another one.
And so they learn it by playful manipulation of the world?
Yes. Their simple world involves only toys and balls and
chimes. But if you think about it, it's a complex
world. We take for granted how complicated. And
kids do it by playful manipulation, plus parents' guidance,
peer wisdom, and hearsay. They meet each other and they say,
you shouldn't have taken my toy. Right. And
these multiple sources of information, they're able to integrate.
So the challenge is about how to integrate,
how to form these causal relationships from different sources of data?
correct so how how how much is information is it to play
how much causal information is required to be able to play
in the crib with different objects I don't know I haven't
experimented with the crib okay not a crib picking up very interesting manipulating
physical objects on this very opening the pages of a book
all the tasks the physical manipulation tasks
do you have a sense because my sense is the world is extremely complicated
it's totally complicated I agree and I don't know how to organize it because
I've been spoiled by easy problems such as
cancer and death okay first we have to start with the
easy easy in the sense that you have only 20 variables
and they are just variables not mechanics
okay it's easy you just put them on the graph and they
they speak to you yeah and you you're providing a methodology for
for letting them speak yeah I'm working only in the abstract
the abstract was knowledge in knowledge out
data in between now can we take a leap to trying to learn
in this very when it's not 20 variables but 20 million variables
trying to learn causation in this world
not learn but some how construct models I mean it seems like you would only
have to be able to learn because constructing it
manually would be too difficult do you have ideas of
I think it's a matter of combining simple models
from many many sources from many many disciplines
and many metaphors metaphors are the basis of human intelligence
basis yeah so how do you think about a metaphor
in terms of its use in human intelligence a metaphor is an expert system
an expert it's mapping a problem
with which you are not familiar to a problem with which you are familiar
like I give you a good example the Greeks believed that the sky
is an opaque shell it's not really an infinite space
it's an opaque shell and the stars are holes
poked in the shell through which you see the eternal light
it was a metaphor why because they understood how you poke holes in a
shell okay they were not familiar with
infinite space okay and so and and we are walking on a
shell of a turtle and if you get too close to the edge you're gonna fall down
to Hades or whatever yeah and that's a metaphor
it's not true but this kind of metaphor enabled Eratosthenes
to measure the radius of the earth because he said come on if we are
walking on a turtle shell then the ray of light
coming to this place will be at a different angle
than the one coming to that place I know the distance I'll
measure the two angles and then I have the radius of the shell
of the turtle okay and he did and he found
his measurements were very close to the measurements we have
today what six thousand seven hundred
kilometers for the radius of the earth that's something that would not occur
to a Babylonian astronomer even though the Babylonian astronomers were the
machine learning people of the time they fit curves and they could predict
the eclipse of the moon much more accurately than the Greeks
because they fit curves okay that's a different metaphor
something that you're familiar with a game a turtle shell
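
The two-angle measurement described above comes down to one line of geometry: if the same sunlight strikes two places at angles that differ by a small amount, the distance between the places is an arc of the shell, and the radius is that arc divided by the angle difference in radians. The sketch below is an illustration under stated assumptions. The function name `radius_from_two_angles` is hypothetical, and the figures of 800 km and 7.2 degrees are commonly quoted approximations of the historical measurement, not numbers given in this conversation.

```python
import math


def radius_from_two_angles(distance_km, angle_a_deg, angle_b_deg):
    """Radius of a sphere from the arc between two places and the sun angle at each."""
    # The difference in sun angle equals the angle the arc subtends at the center.
    delta = math.radians(abs(angle_a_deg - angle_b_deg))
    # Arc length = radius * angle, so radius = arc length / angle.
    return distance_km / delta


# Assumed illustrative inputs: sun overhead at one place, 7.2 degrees off at the other.
r = radius_from_two_angles(800.0, 7.2, 0.0)
print(round(r), "km")
```

With these assumed inputs the result lands within about a percent of the modern mean radius of roughly 6,371 km, which is the sense in which a wrong but familiar model of a shell can still yield a good quantitative answer.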
what does it mean if you are familiar familiar means that answers to certain
questions are explicit you don't have to derive them
and they were made explicit because somewhere in the past
you've constructed a model of that you're familiar with so the child is
familiar with billiard balls yes so the child could predict that if you
let loose one ball the other one will bounce off
you obtained that by
familiarity familiarity is answering questions
and you store the answer explicitly you don't have to derive them
so this is ideal for metaphor all our life all our intelligence
is built around metaphors mapping from the unfamiliar to the familiar
but the marriage between the two is a tough thing
which we haven't yet been able to algorithmize
so you think of that process of using metaphor to leap from one
place to another we can call it reasoning
is it a kind of reasoning it is reasoning by metaphor
do you think of that as learning
so learning is a popular terminology today in a narrow sense
it is it is it is definitely so you may not
okay right it's one of the most important learning
taking something which theoretically is derivable
and storing it in an accessible format i'll give you an example chess
okay finding the winning starting move in chess
is hard but there is an answer
either there is a winning move for white or there isn't or there is a draw
okay so the answer to that is available
from the rules of the game but we don't know the answer
so what does the chess master have that we don't have
he has stored explicitly an evaluation of certain complex pattern of the board
we don't have it ordinary people like me i don't know about you
i'm not a chess master so for me i have to derive
yes things that for him are explicit he has seen it before
or he's seen the pattern before or a similar pattern you see metaphor
yeah and he generalized and said don't move it's a dangerous move
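
The contrast drawn above, between an answer that is derivable from the rules and an answer that is stored explicitly, can be shown on a game small enough to solve. This is a toy sketch, not chess: the game, its rule `MOVES`, and the names `wins_derived` and `wins_stored` are all assumptions made for illustration. Both functions encode the same rules; the second keeps every position it has evaluated, which is the role the chess master's stored patterns play in the discussion.

```python
from functools import lru_cache

# Hypothetical toy game: players alternate removing 1, 2 or 3 stones,
# and whoever takes the last stone wins.
MOVES = (1, 2, 3)


def wins_derived(stones):
    """Derive the answer from the rules alone by searching the whole game tree."""
    return any(not wins_derived(stones - m) for m in MOVES if m <= stones)


@lru_cache(maxsize=None)
def wins_stored(stones):
    """Same rules, but every evaluated position is stored and reused."""
    return any(not wins_stored(stones - m) for m in MOVES if m <= stones)


# Both agree on who wins; only the cost of getting the answer differs.
print(wins_derived(12), wins_stored(12))
print(wins_stored(61))
```
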
it's just that not in the game of chess but in the game of
billiard balls we humans are able to initially derive very effectively and
then reason by metaphor very effectively and make it look so easy
that it makes one wonder how hard is it to build it in a machine
so in your sense how far away are we to be able to construct
i don't know i'm not a futurist all i can tell you is
that we are making tremendous progress in the causal reasoning
domain something that i even
dare to call a revolution the causal revolution
because what we have achieved in the past three decades
is something that
dwarfs everything that was derived in the entire
history so there's an excitement about current machine learning
methodologies and there's really important good work you're doing
in causal inference what is the future
where do these worlds collide and what does that look like
first they're gonna work without collisions
it's gonna work in harmony harmony the human is going to
jumpstart the exercise by providing
qualitative non-committing models of how the universe works
or how in reality the domain of discourse
works the machine is going to take over from that point of view
and derive whatever the calculus says can be derived
namely quantitative answer to our questions
these are complex questions i'll give you some examples of complex questions
that will boggle your mind if you think about it
you take results of studies in diverse populations under diverse
conditions and you infer the cause effect
in a new population which doesn't even
resemble any of the ones studied and you do that
by do calculus you do that by generalizing
from one study to another see what's common between the two
what is different let's ignore the differences
and pull out the commonality and you do it over maybe a hundred hospitals
around the world from that you can get real
mileage from big data it's not only that you have many samples
you have many sources of data so that that's a really powerful thing
and i think for especially for medical applications i mean
cure cancer right that's how from data you can cure cancer
so we're talking about causation which is the
temporal relationships between things not only temporal it's both structural and
temporal temporal precedence by itself
cannot replace causation is temporal precedence the
arrow of time in physics it's important necessary
it's important to fish it yes is it yes
i've never seen the cause propagate backward but if we use the word
cause but there's relationships that are timeless
i suppose that's still forward in the arrow of time but
are there relationships logical relationships
that fit into the structure sure the whole do calculus is logical
relationship that doesn't require a temporal it
has just a condition that it's you're not traveling back in time
yes correct so it's really a generalization of
a powerful generalization of what boolean logic
yeah boolean logic yes that is sort of simply put and allows us
to you know reason about the order of events the source the
not about which means we're not deriving the order of events
we are given cause-effect relationships okay
they ought to be obeying the time precedence relationship
we are given that and now that we ask questions about
other causal relationship that could be derived from the
initial ones but were not given to us explicitly
yeah like the case of the firing squad i gave you
in the first chapter and i ask what if rifleman a
declined to shoot would the prisoner still be dead
to decline to shoot means that he disobeyed the order
and the rules of the game were that he is an
obedient marksman okay that's how you start that's the initial
order but now you ask questions about breaking the rules
what if he decided not to pull the trigger
he just became a pacifist and you and i can answer that
the other rifleman would have killed him okay
i want the machine to do that is it so hard to ask machine to do that
it's such a simple task no but if they have a calculus for that
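
The firing squad question above can be answered mechanically once the rules of the game are written as a model. The sketch below is a minimal, deterministic rendering under stated assumptions: the function names `firing_squad` and `dead_if_a_had_declined` and the `do_a` parameter are hypothetical, and the model assumes the chain court order, captain, two obedient riflemen, prisoner. Breaking the rules is expressed by overriding rifleman A's equation while leaving everything else in place.

```python
def firing_squad(court_order, do_a=None):
    """Deterministic structural model: court -> captain -> riflemen A, B -> death.

    do_a, if given, is an intervention that overrides rifleman A's own equation.
    """
    captain = court_order                       # captain signals iff the court orders
    a_shoots = captain if do_a is None else do_a
    b_shoots = captain                          # B still obeys the captain
    return a_shoots or b_shoots                 # dead if either rifleman shoots


def dead_if_a_had_declined(observed_dead):
    # Abduction: keep only the background conditions that explain what was seen.
    consistent = [c for c in (False, True) if firing_squad(c) == observed_dead]
    # Action and prediction: cut A's equation and rerun the model on those conditions.
    return [firing_squad(c, do_a=False) for c in consistent]


# The prisoner is observed dead; would he still be dead had A declined to shoot?
print(dead_if_a_had_declined(observed_dead=True))
```

Under these assumptions the only explanation of the death is that the court gave the order, so rifleman B fires regardless and the prisoner is still dead.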
yes yeah but the curiosity the natural curiosity for me is that yes you're
absolutely correct and important and uh it's hard to
believe that we haven't done this seriously uh extensively
already a long time ago so this this is really important work but
i also want to know you know this maybe you can
philosophize about how hard is it to learn okay let's assume a learning we
want to learn it okay want to learn so what do we do
we put a learning machine that watches execution
trials in many countries and many
locations okay all the machine can learn is to see
shot or not shot dead or not dead a court issued an order or didn't okay just the
facts from the facts you don't know who listens to whom
you don't know that the condemned person
listens to the bullets that the bullets are listening to the
captain okay all we hear is one command two shots dead okay a
triple of variables yes no yes no okay from that can you learn who listens to
whom and can you answer the question no
definitively no but don't you think you can start proposing ideas for humans to
review you want machine to learn right you
want a robot so robot is watching yeah
trials like that yeah 200 trials and then he has to answer the question what if
rifleman a refrained from shooting yeah so how to do that
that's exactly my point looking at the facts doesn't give you the strings
behind the facts absolutely but do you think of machine learning
as it is currently defined as only something that looks at the facts
and tries right now they only look at the facts so is there a way to modify
yeah in your sense playful manipulation playful manipulation
yes doing the interventionist kind of thing yes intervention but it could be
at random for instance the rifleman is sick that day
or he just vomits or whatever so the machine can observe this unexpected event
which introduces noise the noise still has to be random to be able to
relate it to a randomized experiment and then you have
observational studies from which to infer the strings behind the facts
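
The point that the facts alone do not reveal the strings behind them, while an unexpected event such as a sick rifleman can, is easy to check on two candidate models. The sketch below is illustrative only. The first model follows the firing squad story; the second, in which rifleman B fires only when he hears A fire, is a hypothetical alternative invented here for contrast. All function and parameter names are assumptions.

```python
def model_b_obeys_captain(command, a_sick=False):
    a = command and not a_sick
    b = command                 # B listens to the captain
    return (command, a, b, a or b)


def model_b_follows_a(command, a_sick=False):
    a = command and not a_sick
    b = a                       # hypothetical alternative: B fires only when A fires
    return (command, a, b, a or b)


# Passive observation: both models produce exactly the same set of facts.
facts_1 = {model_b_obeys_captain(c) for c in (False, True)}
facts_2 = {model_b_follows_a(c) for c in (False, True)}
print(facts_1 == facts_2)

# An event outside the usual rules (A is sick that day) separates them.
print(model_b_obeys_captain(True, a_sick=True)[3])
print(model_b_follows_a(True, a_sick=True)[3])
```

Watching any number of ordinary executions cannot distinguish the two models, because they never disagree there; they only disagree on the outcome of the day rifleman A is taken out of play.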
it's doable to a certain extent but now that we are
experts in what you can do once you have a model
we can reason back and say what kind of data you need
to build a model got it so I know you're not a futurist but
are you excited have you when you look back at your life
longed for the idea of creating a human level intelligence yeah
I'm driven by that all my life I'm driven just by one thing
but I go slowly I go from what I know to the next step incrementally so without
imagining what the end goal looks like do you imagine
what the end goal is going to be a machine
that can answer sophisticated questions counterfactuals of regret
compassion and responsibility and free will
so what is a good test is a Turing test
a reasonable test free will doesn't exist yet
there's no how would you test free will and that's so far we know only one
thing I mean if robots can communicate
with reward and punishment among themselves and
hitting each other on the wrist and say you shouldn't have done that
okay playing better soccer because they can do that
what do you mean because they can do that because they can communicate among
themselves because of the communication they can do because they
communicate like us reward and punishment yes
you didn't pass the ball the right the right time
and so therefore you're going to sit on the bench for the next two
if they start communicating like that the question is will they play better
soccer as opposed to what as opposed to what they do now
without this ability to reason about reward and punishment
responsibility and I can only think about
communication communication is and in not necessarily natural language but just
communication just communication and that's important to have a quick
and effective means of communicating knowledge
if the coach tells you you should have passed the ball pink
he conveys so much knowledge to you as opposed to what
go down and change your software right that's the alternative
but the coach doesn't know your software so how can the coach tell you
you should have passed the ball but that our language is very effective you
should have passed the ball you know your software
you tweak the right module okay and next time you don't do it
now that's for playing soccer where the rules are well defined
no no no they're not well defined when you should pass the ball is not well
defined no it's a it's very soft very noisy yes
you have to do it under pressure it's art but uh in terms of aligning
values between computers and humans
do you think this cause and effect uh type of thinking is important to align
the values values morals ethics under which the
machines make decisions is is the cause effect where
the two can come together cause effect is a necessary component
to build an ethical machine because the machine has to empathize
to understand what's good for you to build a model
of you as a recipient which should be very much what what is compassion
to imagine that you suffer pain as much as me as much as me
i do have already a model of myself right so it's very easy for me to map you
to mine i don't have to rebuild a model it's much easier to say oh you're like
me okay therefore i would not hate you
and the machine has to imagine it has to try to fake to be human essentially so
you can imagine that you're that you're like me
right and well who is me that's the fact that that's consciousness
to have a model of yourself where do you get this model you look at yourself
as if you are a part of the environment if you build a model of yourself versus
the environment then you can say i need to have a model of myself i have
abilities i have desires and so forth okay i have a blueprint of my software
not a full detail because i cannot get the halting problem
but i have a blueprint so on that level of a blueprint i can modify things i can
look at myself in the mirror and say if i change this more tweak this model
i'm going to perform differently that is what we mean by free will
and consciousness or just what do you think is consciousness
is it simply self-awareness so including yourself into the model of the world
that's right that some people tell me no this is only part of consciousness and
then they start telling me what they really mean by consciousness and i lose them
yeah for me consciousness is having a blueprint of your software
do you have concerns about the future of ai all the different
trajectories of all of our research yes where's your hope
where the movement heads where are your concerns i'm concerned
because i know we are building a new species
that has the capability of exceeding us
exceeding our capabilities and can breed itself
and take over the world absolutely it's a new species it is uncontrolled
we don't know the degree to which we control it we don't even understand
what it means to be able to control this new species
so i'm concerned i don't have anything to add to that because
it's such a gray area that's unknown it never happened in history
yeah okay the only time it happened in history
was evolution with human beings right it wasn't very successful was it
some people say it was a great success for us it was but
a few people along the way or a few creatures along the way would not agree
so uh so it's just because it's such a gray area there's nothing else to say
we have a sample of one sample of one that's us
but some people would look at you and say yeah but we were looking to you
to help us make sure that sample two works out okay actually we have more
than a sample of one we have theories theories
and that's good we don't need to be statisticians
so sample of one doesn't mean any poverty of knowledge it's not
sample of one plus theory conjectural theory
of what could happen yeah that we do have
but i i really feel helpless in contributing to this argument because i
know so little and and my imagination is limited
and i know how much i don't know and
i but i'm concerned you're born and raised in israel
born and raised in israel yes and uh later served in
the Israel military defense forces in the Israel defense forces
yeah yeah what did you learn from that experience
from my experience there's a kibbutz in there as well
yes because i was in the Nahal which is a
combination of agricultural work and military service
we were supposed i was really idealist i wanted to
be a member of the kibbutz throughout my life
and to live a communal life and uh
so i prepared myself for that uh slowly slowly i
wanted a greater challenge so that's a that's a far world away
both but i learned from that what i can either
it was a miracle it was a miracle that i served in the 1950s
i i don't know how we survived the country was under austerity
it tripled its population from 600,000 to a million point eight
when i finished college no one went hungry
austerity yes when you wanted to buy uh to make an omelette
in a restaurant you had to bring your own egg
and they
imprisoned people for bringing food from the farming
villages to the city but no one went hungry
and i always add to it and
higher education did not suffer any budget cut
they still invested in me in my wife and our generation to get the best
education that they could okay so i'm really
grateful for the opportunity and i'm trying to pay back now
okay it's a miracle that we survived the war of 1948
they were so close to a second genocide
it was all planned but we survived it by miracle
and then the second miracle that not many people talk about
the next phase how no one went hungry
and the country managed to triple its population
you know what it means to triple imagine the united states
going from what 350 million to
yeah and believe this is a really tense part of the world it's a complicated
part of the world israel and all around
the religion is is um at the core of that complexity
one of the components religion is a strong motivating force for many many
people in the middle east yes in your view
looking back is religion good for society
that's a good question for robotics you know
there's echoes of that question should we equip the robot with religious belief
suppose we find out or we agree that religion is good for you to keep you
in line okay should we give the robot the
metaphor of a god as a matter of fact the robot will get it
without us also why because the robot will reason by metaphor
and what is the the most primitive metaphor
a child grows with a mother's smile a father's teaching
father image and mother image that's god so whether you want it or not the robot will
but assuming that the robot is going to have a mother and a father
it may only have a programmer which doesn't supply warmth
and discipline yeah or discipline it does so the robot will have a model
of the trainer and everything that happens in the
world cosmology and so on is going to be mapped into
the programmer it's god the thing that represents
the origin of everything for that robot the most primitive relationship
so it's going to arrive there by metaphor and so the the question is if
overall that metaphor has served us well as humans i really don't know
i think it did but as long as you keep in mind it's only a metaphor
so if you think we can can we talk about your son
yes yes can you tell his story a story
Daniel the story is known he was abducted in Pakistan
by an al-qaeda driven sect and
under various pretenses i don't even pay attention to what the pretense was
originally they wanted to have the united states
deliver some promised airplanes it was all made up and all these
demands were bogus i don't know really but
eventually he was executed in front of a camera
at the core of that is hate and intolerance
the core yes absolutely yes we don't really appreciate
the depth of the hate with
which billions of people are educated
we don't understand it i just listened recently
to what they teach you in Mogadishu
when when the water stopped in the tap
we knew exactly who did it the jews the jews
we didn't know how but we knew who did it
we don't appreciate what it means to us the depth is unbelievable do you think
all of us are capable of evil
and the education the indoctrination is really what creates
absolutely we are capable of evil if you are indoctrinated sufficiently
long and in depth we are capable of ISIS we are capable of
Nazism yes we are but the question is whether we after we have
gone through some western education and we learned that everything is really
relative there is no absolute god it's only a belief
in god whether we are capable now of being transformed
under certain circumstances to become brutal yeah
that is what i'm worried about because some people say yes
given the right circumstances given an economical
a bad economical crisis okay you are capable of doing it too
that worries me i want to believe i'm not capable
seven years after Daniel's death you wrote an article
in the wall street journal titled Daniel Pearl and the Normalization of Evil
yes what was your message back then
and how did it change today over the years
i lost what was the message the message was that
we are not treating terrorism as a taboo
we are treating it as a bargaining device
that is accepted people have grievance and they go and
bomb restaurants okay it's normal look you're not even
surprised when i tell you that you know 20 years ago you'd say what
for a grievance you go and blow up a restaurant
today it's becoming normalized the banalization of evil
and we have created that ourselves by normalizing by
by making it part of
political life it's a political debate every
terrorist yesterday becomes a freedom fighter today and tomorrow it's
becoming terrorist again it's switchable
all right and so we should call out evil when there's evil
if we don't want to be part of it be coming if you want
yeah if we want to separate good from evil that's one of the first things
in the garden of Eden remember the first thing
that god tells him was hey you want some knowledge
here's a tree of good and evil so this evil touched your life personally
does your heart have anger sadness or is it hope
okay i see some beautiful people coming from
pakistan i see beautiful people everywhere but i see
horrible propagation of evil in this country too
it shows you how populistic slogans can catch the mind of the best
intellectuals today's father's day i didn't know that
yeah i heard it what's uh what's what's uh
fond memory you have of daniel oh many good memories
immense he was my mentor
he had
a sense of balance that i didn't have
yeah he saw the beauty in every person
he was not as emotional as i am more
looking at things in perspective he really liked
every person he really grew up with the idea that a foreigner
is a reason for curiosity not for fear
one time we were in berkeley and a homeless man came out from some dark alley
and said hey man can you spare a dime i retreated back you know two feet back
and he just hugged him and said here's a dime
enjoy yourself maybe you want some
um money to take a bath or whatever where did he get it not from me
do you have advice for young minds today dreaming about
creating as you have dreamt creating intelligent systems
what is the best way to arrive at new breakthrough ideas and carry them through
the fire of criticism and and past conventional ideas
ask your questions
freely your questions are never dumb
and solve them your own way okay and don't take no for an answer
look if they are really dumb you will find out quickly
by trial and error to see that they are not leading any place
but follow them and try to understand things your way
that is my advice i don't know if it's going to help anyone
no that's brilliant there is a lot of
inertia in science in academia it is slowing down science
yeah those two words your way that's a powerful thing
it's against inertia potentially against the flow against your professor
against your professor i wrote the book of why
in order to democratize common sense
in order to instill rebellious spirit in students
so they wouldn't wait until the professor
gets things right
so you wrote the manifesto of the rebellion
against the professor against the professor yes
so looking back at your life of research what ideas do you hope ripple through
the next many decades what what do you hope your legacy
will be i already have a tombstone
carved
oh boy the fundamental law of counterfactuals that's what
it's a simple equation what a counterfactual is in terms of a model
surgery that's it because everything follows from that
if you get that all the rest
i can die in peace and my students can derive all my knowledge
by mathematical means the rest follows yeah
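
For readers who want the equation itself: in Pearl's published work the fundamental law of counterfactuals is usually written as below. The notation is not spelled out in the conversation, so the symbols here are taken from his writing rather than from this transcript. $M$ is a structural causal model, $M_x$ is the model after the surgery that replaces the equation for $X$ with $X = x$, $u$ is a setting of the background variables, and $Y_x(u)$ is the value $Y$ would have taken had $X$ been $x$.

```latex
Y_x(u) = Y_{M_x}(u)
```

Read aloud: the counterfactual value of $Y$ is whatever $Y$ evaluates to in the surgically modified model, with the background conditions held fixed.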
yeah thank you so much for talking today i really appreciate it thank you for
being so attentive and uh instigating we did it
we did the coffee helped thanks for listening to this conversation
with Judea Pearl and thank you to our presenting sponsor
Cash App download it use code lex podcast
you'll get ten dollars and ten dollars will go to FIRST
a STEM education nonprofit that inspires hundreds of thousands of young minds
to learn and to dream of engineering our future
if you enjoy this podcast subscribe on youtube give it five stars on apple
podcast support on patreon or simply connect with me
on twitter and now let me leave you with some words of wisdom
from judea pearl you cannot answer a question that you cannot ask
and you cannot ask a question you have no words for
thank you for listening and hope to see you next time