A Masterclass from the Pioneer of Artificial Intelligence

November 21, 2024

Jürgen Schmidhuber shares his insights on the evolution and future of AI. He discusses the development of self-improving AI systems, the concept of artificial curiosity, and the potential for machines to achieve creativity akin to humans. Schmidhuber also explores the implications of advanced AI on society, emphasizing the importance of aligning AI goals with human values to ensure beneficial outcomes. He envisions a future where AI contributes positively to various fields, including science, art, and technology.

Presented at BBVA’s Aprendemos Juntos 2030 event.

00:16

Schmidhuber

My name is Jürgen, and 50 years ago, in the 1970s, I wanted to become a physicist like my hero, Einstein—maybe the most famous scientist of them all—and I wanted to solve the mysteries of the universe. But then gradually, over the years, as a teenager, I realized that there might be something even more important and more impactful that I could do, which is to build an artificial scientist that learns to become much smarter than myself. So back then I realized I’m not very smart, but maybe just smart enough to build something that learns to become much smarter than myself: an artificial scientist that can solve all the problems that I cannot solve myself, such that I can retire, and this artificial scientist can solve the mysteries of the universe.

01:23

And I told my mom about that in the seventies, and she said I’m crazy. In the eighties, when I studied computer science to make all of that possible, I said the same thing to my fellow students, and they also said I’m crazy: “It’s not possible to build within your lifetime an AI that is smarter than a human.” That’s what they said. Then, in 1987, I started publishing on that. And in the early nineties we had all kinds of interesting papers about artificial neural networks, which are behind what’s now called artificial intelligence. Most of modern AI is driven by these artificial neural networks. And back then, we had a couple of cool inventions, and nobody was interested in them. In fact, back then, in the early nineties, I gave a talk at a little conference about our novel achievements in AI, but nobody was interested. There was just one single person in the room; a young lady. I said, “Young lady, it’s very embarrassing, but apparently I’m going to give this talk just to you.” And she said, “Okay, but please hurry, I’m the next speaker.” So nobody was interested back then.

02:57

However, the same old algorithms that we published back then, today, they are on your smartphones. And the AI for speech recognition and translation and all kinds of things is based on what we did back then. How is that possible? Why didn’t it work back then? Why does it work today? Because every five years compute is getting ten times cheaper—which means thirty years are equivalent to a factor of a million. So today we can compute a million times more for the same price as back then when I published these papers. And suddenly you have things such as ChatGPT. And the G and the P and the T in ChatGPT, they have their roots in what we did around 1990–91. But back then you couldn’t do much with it. But today we have ten million times more compute for the same price, and suddenly you can do all kinds of really cool things with it.
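To spell out the arithmetic behind that claim (this is just the compounding he describes, written out):

```latex
\[
\underbrace{10 \times 10 \times 10 \times 10 \times 10 \times 10}_{30/5 \;=\; 6 \text{ five-year periods}}
\;=\; 10^{6} \;=\; 1{,}000{,}000
\]
```

Six successive factors of ten compound to a million; another thirty years of the same trend multiplies that by a million again, which is exactly the claim in the next paragraph.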

04:01

The best thing is: we haven’t seen anything yet. Because in the next thirty years, if the trend doesn’t break—and apparently it doesn’t break—we will gain another factor of a million. And everything that we find impressive today will seem ridiculous thirty years from now, and people will look back and say, “Look, they were so impressed by large language models and ChatGPT,” and they will smile at how naïve we were. And this is an incredible evolution that we are going to see here, which is a continuation of the old evolution that has been running for decades, but beneath the radar. And everything is going to change.

04:57

Now, the same old guys—my colleagues who wrinkled their eyebrows at my predictions back then—today, they are no longer laughing about this, and some of them have come around and say: AGI, Artificial General Intelligence, is now close, just because of ChatGPT and similar large language models. But that’s nothing compared to what we are going to see. All of this is already transforming civilization, but there will not be a single aspect of civilization within the next few decades that’s not going to be deeply, deeply influenced by it.

05:46

One thing that is important to point out: there are not only those AIs that are tools of humans and that slavishly imitate what humans have taught them. No, there are AIs—they have existed in my lab for decades—that set themselves their own goals, like little artificial scientists, like little scientists in general, like babies. How do babies learn? They don’t learn by downloading all the data from Facebook. No, they learn by inventing their own experiments, their own goals. They set themselves their own goals and say: let’s see what happens if I do that and that. And through their self-invented experiments they learn to figure out how the world works and how to become better problem-solvers in the world. Most of these experiments are self-invented; they learn rather little from their parents, and they learn a lot through their own behavior as little scientists. Now we have artificial neural networks that do the same, and I think these are going to be the most interesting ones: those AIs that don’t just slavishly imitate humans, but set themselves their own goals. If you don’t give them the freedom to set themselves their own goals, they are not going to become as smart as the other AIs that do have that freedom.

07:21

So these self-driven AIs will solve lots of problems that are of interest to humankind, but there will be a next step which transcends that. Self-driven AIs, artificial scientists, are going to go to the place where most of the physical resources are to build more and even bigger AIs. And that’s not our tiny little biosphere—no, that’s outer space, where almost all the matter and energy is that you need for building more robots, and more AIs, and more infrastructure, and more self-replicating robot factories, and all that stuff. So the near-term future will be very human-centered, but in the long run this is going to develop in a way where humans won’t be able to follow, because AIs are going to spread—from the biosphere, throughout the solar system, and then the galaxy, and then the rest of the visible universe. It will take a long time because of light speed—the universe is so big and it will take time to colonize it all—but that’s what’s going to happen.

08:44

And so what we are currently witnessing is not just another industrial revolution, it’s something that’s going to transcend civilization and humankind as we know it. It’s something comparable to what happened 3.5 billion years ago when life emerged; when chemistry became biology. Something huge like that is currently happening, and it’s a privilege to witness the beginnings of that and to contribute something to that.

09:28

Are there any questions?

09:32

Octavio

Hello, Jürgen. My name is Octavio. The [???] said that you have invented everything. You have made a tremendous contribution to artificial intelligence. Could you tell us a little more about this contribution and its current applications?

09:47

Schmidhuber

There are lots of applications of these AIs that we have developed, but before I go a little bit deeper, let me first explain what this modern AI thing is about. It’s really about artificial neural networks, which are a little bit like what you have in your brain. So they are inspired by the human brain. The human brain has about 100 billion little processors which are called neurons. And each of these neurons is connected to maybe 10,000 other neurons on average. Some of these neurons are input neurons—like the cameras, the eyes—where every millisecond, hundreds of millions of pixels are coming in, and it’s just little numbers; you can imagine little numbers between zero and one. And there are many other input neurons, like the microphones, the ears, and the tactile sensors, and all of them are connected through cables with this brain. And some of the other neurons are output neurons, and whenever you switch them on through your thinking, then something moves—your finger muscles move, or your speech muscles as you are speaking.

11:01

And your life is about taking all these data streams that are coming in and translating them into actions that lead to success—where success is like eating three times a day, but to do that you have to make money, so you have to learn how the world works and get a job, and go to the supermarket and put something in the fridge. Very complicated stuff, super complicated stuff. And kids need twenty years to learn that. Now, the artificial neural networks that we have are similar in the sense that they also have all these connections, and each connection has a little strength that says how much this neuron over here influences that neuron over there at the next time step. So in the beginning, when you are a baby, all these connections seem to be random. But then, through learning, some of them get stronger and some of them get weaker, such that in the end the whole brain can learn to do all kinds of things that it didn’t know how to do before, like driving a car or recognizing speech or translating from one language to another. And it’s very much like that with our artificial neural networks, which also implement these principles.
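A minimal sketch of the principle described here might look as follows. The single-neuron setup and all numbers are invented for illustration; the point is only that connection strengths start random and are then nudged to reduce error:

```python
import numpy as np

# Hypothetical single "neuron" with three incoming connections.
# Each connection has a strength (weight); learning nudges those
# strengths so the output moves toward a desired target.

rng = np.random.default_rng(0)
weights = rng.normal(size=3)          # random at first, like the "baby" brain

def neuron(inputs, weights):
    # Weighted sum of inputs, squashed into (0, 1).
    return 1.0 / (1.0 + np.exp(-inputs @ weights))

inputs, target = np.array([0.2, 0.9, 0.5]), 1.0
for step in range(100):
    out = neuron(inputs, weights)
    error = out - target
    # Make some connections stronger and some weaker to reduce the error
    # (gradient descent on a squared-error objective).
    weights -= 0.5 * error * out * (1.0 - out) * inputs

print(f"output after learning: {neuron(inputs, weights):.3f}")   # near 1.0
```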

12:22

And so what we did was: we created certain types of artificial neural networks that were better than the previous artificial neural networks at learning from long sequences. If you read a long text and have to classify it at the end, you first have to memorize what came before. And all of that didn’t work in the 1980s. But then we found ways of greatly compressing the data which is coming in, through all kinds of tricks. So they are related to the “P” in ChatGPT: the “P” stands for “pre-trained.” And we had something that today is called an unnormalized linear transformer—whatever that may be; you are not experts, you don’t care about that. But that’s a particular type of what is called a transformer, which is now widely, widely used for these large language models. So that’s the “T” in GPT. And the “G” in GPT stands for “generative.” It also occurs in “generative AI”; everybody is talking about generative AI. And there our contribution, in 1990, was basically to have two neural networks that fight each other. There’s one neural network that produces outputs, and there’s another one that sees the outputs of the first network and tries to predict their consequences. So the second network, the predictor, tries to minimize its error as it predicts the consequences of the outputs of the first network, and it becomes better and better at that. But the first network is fighting the second network: it says, “By changing my weights, making some of them stronger and some of them weaker, I want to learn to produce outputs that fool the second guy, such that the second guy still makes errors.” So it’s maximizing the same error function that the second guy is minimizing. They are fighting each other. And this little trick of 1990 is today used a lot to make deepfakes, where the outputs are images, and the predictions are: is this image fake or real? And maybe you have seen these images, which look very realistic, but they aren’t real.
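A toy sketch of that two-network fight could look like this. Everything here is invented for illustration (one scalar generator parameter, one scalar predictor parameter, a made-up environment); real systems use full networks on images:

```python
import numpy as np

rng = np.random.default_rng(1)

def environment(action):
    # The "world": a fixed mapping the predictor must discover.
    return np.sin(3.0 * action)

gen_w = rng.normal()    # generator: action = gen_w * context
pred_w = rng.normal()   # predictor: guess  = pred_w * action
lr, eps = 0.01, 1e-4

for step in range(2000):
    context = rng.normal()
    action = gen_w * context
    err = pred_w * action - environment(action)   # the SHARED error

    # Generator: finite-difference estimate of how its parameter changes
    # the surprise, then a step UP -- it wants the predictor to keep failing.
    a2 = (gen_w + eps) * context
    surprise_up = (pred_w * a2 - environment(a2)) ** 2
    gen_w = float(np.clip(gen_w + lr * (surprise_up - err**2) / eps, -3, 3))

    # Predictor: gradient step DOWN on the very same squared error.
    pred_w -= lr * err * action

print(f"final predictor weight: {pred_w:.3f}")
```

The two parameters optimize the same error in opposite directions, which is the adversarial principle described above.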

14:56

So these are some of the things we did back then. I could go on and on. My diploma student, Sepp Hochreiter, in 1991—it all happened around 1990–1991—wrote a diploma thesis which not only implemented this idea of mine behind the “P” in ChatGPT, but also analyzed the reason why deep learning didn’t work. Deep learning is just another word for deep neural networks that learn, and it didn’t work back then, but then we made it work. Sepp’s thesis contained a huge insight: it made mathematically clear why deep learning didn’t work, and it pointed to a solution. Today this solution is called residual connections. And that’s the core of what is called long short-term memory today—something that has its roots in the diploma thesis of my student. We finally published it in a journal in 1997. And again, nobody really cared much about it, until computers were fast enough around 2010, and suddenly we were able to win competitions with it through the work of my other students, like Alex Graves and Felix Gers. And then in the 2010s, the big companies on the Pacific Rim saw: oh, they can do much better speech recognition with that than with the previous methods. And then it became really popular. So it took a while until it became really popular, but now it’s on your smartphones.
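The insight in that thesis can be illustrated with two lines of arithmetic. The sketch below uses invented numbers: in a plain recurrent net the error signal picks up a multiplicative factor at every time step, while the LSTM cell state is updated additively, so the signal survives:

```python
# Why deep/recurrent learning failed, in miniature (illustrative numbers).

steps = 100

# Plain recurrent net: the gradient through time is multiplied by a factor
# at every step; with |factor| < 1 it vanishes exponentially.
factor = 0.9
print(f"plain RNN gradient after {steps} steps: {factor ** steps:.2e}")  # ~2.7e-05

# LSTM-style cell: c_t = forget * c_{t-1} + input_gate * candidate.
# With the forget gate near 1, the derivative dc_t/dc_{t-1} stays near 1,
# so the error signal survives -- the additive "residual" path mentioned above.
forget = 1.0
print(f"LSTM cell-state gradient after {steps} steps: {forget ** steps:.2e}")  # 1.0
```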

16:30

Andrés

Hello, Jurgen. I’m Andrés. Thank you for sharing with us this moment. At some point in time, we would like to build an artificial intelligence that makes the universe a much happier place. Regarding this, do you have any inspiring examples that you can share with us? And above all, what is your greatest hope regarding artificial intelligence?

16:50

Schmidhuber

Yeah. Before we talk about the entire universe—it’s my fault, I talked about the entire universe before—let’s just talk about how AI is really making human lives longer and healthier and easier and happier, to a certain extent. Take 2012, for example, twelve years ago, when compute was maybe 200 or 300 times more expensive than today. Our AI was used by my team to win a competition about cancer detection. So we had slices through the female breast, and there you saw certain cells in the microscope. Some of the cells were dangerous—“mitosis cells” they are called; they are pre-cancer-stage cells—and others are harmless. Normally you need a trained human histologist to say, “Harmless cell, harmless cell, potentially dangerous cell,” and so on. But our system in 2012 was able to win against all the other systems from industry and academia. And it was just a deep learning neural network which knew nothing about histology. We ourselves also knew nothing about histology. But we trained it on lots of data, and then it was just better than all these competing systems. And today the same thing is used in thousands of applications—not just for cancer detection, but also detection of plaque in the arteries in CT scans, and all kinds of applications in healthcare like that.

18:36

If you Google for LSTM—LSTM is this long short-term memory, a pretty famous thing that we did; it’s now actually the most cited AI of the twentieth century—and you add some term from healthcare (for example diabetes: “diabetes LSTM”), you will find lots of papers that have LSTM and diabetes in the title—not only in the text, but in the title. Because somebody used it to better detect diabetes and better predict diabetes. And there are lots of applications for arrhythmia, for example, and all kinds of heart problems, and many other diseases.

19:21

So maybe the thing that makes me happiest is actually that these artificial neural networks are really, really useful in healthcare, and they are already making human lives longer and healthier as we speak, even today. And to the extent that we can get more data from all the sick persons out there, lots of additional benefits will come. For example, if we could see all the data from hospitals—which patient gets which medications prescribed, and really bought them and really used them, and under which conditions—and if we could at the same time have a look at the heart scans and whatever else they have there. Maybe it turns out, if you look at lots of data like that, that you will see all kinds of cross-correlations between medications that are currently unknown. Maybe a guy who takes this medication and five years later adds this additional medication is, in almost all cases, dead from a heart attack two years later—or something like this. You can learn that from the data. And at the moment the biggest problem is really the access to the data. Our algorithms can learn that from the data, and there are so many ways of improving healthcare through that. So there’s a huge future just in the field of healthcare.

20:55

What has already happened is that our AI has really broken down the linguistic barriers between nations. Fifteen years ago, when I went to China, I had to show the taxi driver a picture of my hotel so that he knew where I wanted to go, because I couldn’t talk to him. And today he holds a smartphone up to my face, and he says something in Mandarin and it comes out in English or German, and then I speak back and we have a conversation. So not only has the communication between individual people become much easier, but also the communication between entire nations. And there are so many additional examples along the lines of these sustainable development goals where our AI is being used to improve the world in many ways. So that’s my near-term hope for AI. And the long-term hope—but this is really about different timescales; about the future of the universe—is that it’s going to expand and make the entire cosmos intelligent. But that will take a couple of tens of billions of years, just because light speed is so slow.

22:09

Isabel

Hello, Jürgen. I am delighted to be here. I am Isabel. I would like you to tell me, in general terms, please, how artificial intelligence can help solve major problems such as climate change, sustainability, or equity.

22:23

Schmidhuber

Yeah. Thank you for the question, Isabel. The easy answer is: just take one of the algorithms that we have developed—for example LSTM; that’s maybe the most famous one—and combine it with some sustainable development goals. The United Nations has this list of seventeen sustainable development goals: SDG 1, SDG 2, up to SDG 17. Take any of them and just use Google Scholar to find papers on how our AI is being used to achieve some of these goals. You will find lots of stuff. For example, our techniques are being used to predict droughts, or to take satellite images and predict crops, and predict the weather and how it is going to affect the crops, and predict the success of fishing in certain areas of the world, and predict, from satellite and drone images, how many people are going to burn down parts of the Amazon, and predict air quality. So many people around the world are suffering from really bad air quality, and you want to be able to monitor that and track that in order to find ways of improving the air quality.

23:53

Or, generally speaking, everybody is talking about global warming. And there are many ideas for how to reduce it—maybe not stop it, but reduce it. And one of the projects I’m involved in is really about extracting carbon dioxide from thin air. Whenever you burn something, carbon dioxide is emitted into the atmosphere. It’s a greenhouse gas, so it further increases the temperature on the planet, and it affects billions of people. And one of the ideas is that one can extract it back using certain types of materials—catalysts and all kinds of clever mechanisms; metal–organic frameworks (MOFs), for example. And then you can use AI like an artificial chemist.

24:49

How does that work? You first train it to become an expert in chemistry by giving it lots of experiments. That means you have lots of substances that interact with each other under certain pressures and certain temperatures with certain catalysts—this is the input that the neural network sees. Then the reaction takes place and something comes out, and it learns to predict what comes out: the substances and their properties. Just from the training data, it learns to become something like an intuitive chemist, and it can predict very well the effects of reactions that it has never seen before. So now you can take this artificial neural network that has learned to become a chemist and work it backwards. Now you say: “I want to have an output substance which is twice as effective as the best that I have ever seen at extracting carbon dioxide from the air. How do I have to change the experiment? How do I have to change the input substances and the temperatures and the pressures and the catalysts to make that possible?” And you will get a suggestion. Now either the network already knew so much about chemistry from all its training examples that the proposal is a good one, and it works, and you are happy. Or it’s not a good one, because when you try it in the world it doesn’t work as predicted. But then you have a new training example which you can give to the artificial chemist, and it becomes a better chemist. And so you repeat the whole thing. This is already being used in all kinds of chemistry applications, and we are currently using the same approach to improve the catalysts for extracting carbon from thin air—“direct air capture,” it is called.
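Schematically, that loop could be sketched as follows. This is a toy stand-in: the "lab" is a made-up function, the forward model is a nearest-neighbor lookup, and the backwards search is plain random search; real direct-air-capture models are far richer:

```python
import numpy as np

rng = np.random.default_rng(2)

def lab_experiment(x):
    # Stand-in for the real lab: the true (unknown) effectiveness
    # of experimental conditions x. Higher is better.
    return -np.sum((x - 0.7) ** 2)

# 1. Past experiments: conditions (pressures, temperatures, catalysts) -> outcome.
X = rng.uniform(0, 1, size=(50, 3))
y = np.array([lab_experiment(x) for x in X])

def model_predict(x):
    # Crude forward model: outcome of the most similar past experiment.
    return y[np.argmin(np.linalg.norm(X - x, axis=1))]

for round_ in range(5):
    # 2. Work the model backwards: search the INPUT space for conditions
    #    with the best predicted outcome.
    candidates = rng.uniform(0, 1, size=(500, 3))
    best = candidates[np.argmax([model_predict(c) for c in candidates])]
    # 3. Try the suggestion in the real world; whether it works or not,
    #    it becomes a new training example and the "chemist" improves.
    result = lab_experiment(best)
    X, y = np.vstack([X, best]), np.append(y, result)
    print(f"round {round_}: proposed {best.round(2)}, measured {result:.3f}")
```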

26:51

So there are lots of applications that are totally aligned with these seventeen sustainable development goals that I mentioned, and they cover pretty much all of them.

27:07

Diego

Hello Jürgen, I am Diego and I am twelve years old. I wanted to know how you think artificial intelligence will affect employment in the next few years, and what you would recommend I study when I grow up.

27:20

Schmidhuber

That’s a wonderful question, Diego. So, what’s currently working well in AI is the AI behind the screen. All the desktop jobs can be greatly facilitated through AI now. For example, you can tell your AI: “Make a summary of these ten documents,” and it’s going to be pretty good. What’s not working well at all is the AI in the physical world, with real robots and real machines that change the world. For almost thirty years we have had AIs that can play chess better than any human. And for a few years we have had AIs that can play video games as well as the best humans. But, you know, all of that is AI behind the screen. And there is no AI-driven robot that can do what a twelve-year-old boy can do with a football. Because AI in the physical world—where you really use your actuators, your fingers and your feet, to go from A to B, and manipulate things, and make things, and construct stuff—all of that seems easy to humans, but it’s really, really difficult for existing robots. So everything in the physical world is much harder.

28:59

And you know, you now have several options for the future: what kind of school do you want to go to? Maybe you want to do the standard thing and try to go to college, and you will learn all kinds of things that ChatGPT can probably already do better than many of the pupils there. What you should learn is to do more of the stuff that is really difficult for AI, which is doing something with your fingers and your hands. I guess this is going to be reflected in the salaries of all kinds of handicraft workers—electricians, for example, you know. There is no robot that can go to your house and fix the electric wiring there, and so on.

29:52

So as long as that doesn’t work—as long as AI in the physical world doesn’t work well—try to focus a little bit on things for which you need your fingers, your body, your manipulation skills, and all that stuff. Don’t neglect that part. You also have to learn the other things: how to write summaries and make drawings, and the basics of math and physics. It’s very important to learn that, because the world as we know it is running on math and physics. But don’t neglect all these physical skills, and make sure you end up in a school where the sports lessons aren’t cancelled all the time. No robot has incredible actuators like these, you know, where I have these five fingers covered with millions of sensors. If I wanted to build an artificial hand like this, I wouldn’t even know where to put all these cables. It’s so amazing, this miracle that is a hand, and nothing in the technical world comes close to it. It can even repair itself: you cut it and it repairs itself. So it’s totally amazing.

31:07

However, in the long run—in the long run—everything that currently only humans can do, robots are also going to learn. Not yet; that’s the next stage, but it’s going to happen. So what is left for humans? I think what’s left for humans are the very human-specific things, and these are usually about interacting with other humans. Look at industrial robots, which are not very clever. They were introduced maybe forty years ago, and back then lots of people said they were going to take all the human jobs away. And to a limited extent it was true, because back then in the car factories there were hundreds of workers assembling cars. And then, a few years later, or maybe a few decades later, in the same factories you had hundreds of robots and maybe three guys occasionally checking what the robots are doing.

32:13

However, in those countries where you have lots of robots like that, you have low unemployment rates, because in those same countries all kinds of new jobs were created that nobody anticipated. Forty years ago, nobody would have predicted all these young people making money as YouTube video bloggers, interacting with other humans in new ways. And most of these new jobs are luxury jobs. But I would say that almost all of our jobs are luxury jobs anyway, because most of them are not really important for the survival of the species. There are just a few important jobs, like farming (you know: getting something to eat), and building houses (so that it doesn’t rain on you when you sleep at night), and warming the houses up or cooling them down. And all of that can be done by less than ten percent of the population. And then there are lots of luxury jobs, like journalists. Yeah, it’s important, and they often make more money than those who build the houses. But it’s not essential for the survival of the species.

33:25

And what humans are really good at is inventing new luxury jobs all the time, which are about interacting with other humans in novel ways. They’re not focused on other robots or something. No, they are focused on humans. And I think it’s going to continue like that. And humans won’t have a problem with realizing that maybe there are smarter beings out there and they do their own thing, because they are going to keep doing the human thing: inventing all kinds of luxury jobs that aren’t really necessary, but fun.

34:03

David

Hello, Jürgen. It’s a pleasure to share this time with you. My name is David. Some people fear that artificial intelligence might become self-aware and that we could lose control over it. Do you have any reassuring words for them? Or, on the contrary, is there something to be afraid of?

34:21

Schmidhuber

I don’t have exactly a reassuring message for those people, but I don’t think there’s much reason to worry about that. A couple of years ago I gave an interview where I said: “We have had little, simple, conscious machines for decades; since 1991.” So the title of this interview was something like Jürgen Schmidhuber claims that conscious machines have existed since 1991. And let me explain to you how that works: how to build a conscious, self-aware machine, which we already have. It’s really simple. First of all, you have one neural network just receiving inputs—videos and whatever—and producing actions. It’s interacting with the world, so it changes the world. And through these changes, the video that comes in changes, so it learns to predict the changes. That’s how it builds a model of the world—a “world model,” as I called it in 1990. And this world model is a second neural network, which just learns to predict the consequences of the actions of the first network.

35:35

Now, the first network wants to maximize its reward. There are these special inputs, which are the reward inputs. For example, three times a day you have to eat something, otherwise you will get hungry, and this will be negative reward coming from your hunger cells. You want to avoid those: you want to minimize pain and maximize reward. So there are these special input signals that have a lot of meaning, because you want to either maximize them or minimize them. And of course, whenever we build a robot or an artificial agent, we give these agents pain sensors and reward sensors, because in the beginning they are so stupid and do all kinds of experiments, and they have to learn what’s good for them and what’s not. If they bump against an obstacle, they could get damaged—that’s why we give them pain sensors. We don’t tell them exactly how to avoid pain. No, they have to figure that out by themselves through a learning algorithm. We just say, “Here’s the objective: until the end of your life, maximize the sum of all the pleasure signals and minimize the sum of all the pain signals.” It’s very easy to formulate that in a computer program.

36:54

And then the consequences of that simple program are possibly enormous, because now the agent tries all kinds of things, and over time it learns which things it should avoid and which things are good for it. Over time, whenever the battery is low and negative hunger signals are coming from the battery, it learns to go to the charging station and sit down there and get recharged—pleasure signals, just positive numbers coming in—without running into obstacles on the way to the charging station, where it would get pain if it bumped against them. It learns to avoid the obstacles, because it learns to use the incoming video and translate it into actions that go around the obstacles. So you have that. And now you have a simple sort of emotions: it wants to avoid pain, and it wants to maximize pleasure.

37:54

Then there is the second network, which is just predicting the consequences of the actions. The second network is like a simulation of the world, more or less. It learns to predict what’s going to happen if I do that and that, and then the first network can use the second network for planning ahead, for planning the future. How does that work? Well, it can use this second network—which is an imperfect simulation of the world—to try out, in mental experiments, a few action sequences. “If I do that, oh, then I will bump against the obstacle, and I will not reach the charging station. So that’s not a good action sequence.” And then it tries another action sequence. “Oh, if I execute that action sequence, I will move around the obstacle, I will reach the charging station, and all will be good.” So it’s going to select the second action sequence. It’s using the model of the world for planning.
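In code, that planning by mental experiment might be sketched roughly like this, with toy dynamics and rewards invented for the example (a real world model would itself be a learned network):

```python
import numpy as np

rng = np.random.default_rng(3)
charger, obstacle = np.array([5.0, 5.0]), np.array([2.5, 2.5])

def world_model(state, action):
    # The learned predictor (here: hand-given toy dynamics): next state.
    return state + action

def imagined_return(state, actions):
    # Mental experiment: reward for approaching the charger,
    # predicted "pain" for getting close to the obstacle.
    total = 0.0
    for a in actions:
        state = world_model(state, a)
        total -= np.linalg.norm(state - charger)    # closer is better
        if np.linalg.norm(state - obstacle) < 1.0:
            total -= 100.0                          # predicted pain
    return total

# Try out several candidate action sequences purely in imagination,
# then execute the one the world model predicts will end best.
candidates = rng.uniform(-1, 1, size=(200, 8, 2))
scores = [imagined_return(np.zeros(2), seq) for seq in candidates]
plan = candidates[int(np.argmax(scores))]
print("best imagined plan, first action:", plan[0].round(2))
```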

38:49

Now, what happens in the model of the world? Everything that frequently shows up in the environment gets internal neurons that represent it. That’s just the nature of the learning mechanisms of these neural networks. So in an environment where there are lots of objects of some class, you will have internal neurons that stand for typical instances of that class. And in an environment where there are lots of faces, you will get internal neurons that react to typical appearances of faces. When a new face comes along, all you need to encode are the deviations from the prototype face that is in there. All of that helps the system compress its experiences into a few neurons.

39:29

Now, the important thing is to realize that there is one thing in the life of the agent that is always there when the agent is active, which is the agent itself. So, just as a simple by-product of these standard machine learning algorithms, you will get internal representations of the agent itself: of its actuators, its fingers, its arms, its legs, whatever it has, its battery, the predictions of how the battery will get fuller at the charging station. All these predictions are now part of this world model, which includes a model of the agent itself.

40:08

Now, the moment the controller is using this world model for planning and wakes up these internal representations that stand for the agent itself, it’s thinking about itself—it’s self-aware. So self-awareness, in the context of goals that the little guy wants to achieve, is the most natural thing. And we have had that for decades. And we have had certain other aspects of consciousness for decades, too. So I believe that everything that people think is connected to this weird concept of consciousness already exists in our current (actually our old) machine learning models; our old AIs that interact with the world, build a model of the world, and use it for planning. And in the planner, in the world model, there’s a representation of the agent itself, and it wakes up all the time.

41:11

Elena

Hello, Jürgen. I’m Elena. Many people, including experts, are saying that very soon, artificial intelligence will surpass human intelligence in many aspects, not just a few. My question is: when do you think this could happen? And most importantly, what consequences could it have for us?

41:28

Schmidhuber

It has already happened in many, many fields. In 1997, for example, for the first time, the best chess player on the planet wasn’t human any longer. Back then they needed a big computer, but today a trivial little laptop can play better chess than any human on this planet. So in chess we have had superhuman performance for a long time. If we go back even further: multiplying two numbers—even a little calculator can do that, and it’s also superhuman performance, because no human can multiply numbers as quickly as these calculators. And now we have more and more things like that. Now we’re beginning to see really important applications in healthcare, for example, where these artificial doctors can detect certain types of diseases better than the human experts. In 2011, my team, with my brilliant Romanian postdoc Dan Cireșan, for the first time achieved superhuman performance in a field which is important for self-driving cars: traffic sign recognition. That was the first time a pattern recognition contest was won by a deep learning system, which in the beginning knew nothing, but from lots of examples learned how to recognize these traffic signs. That was 2011, so almost fifteen years ago. Back then, compute was almost 1,000 times more expensive than today, which means today we can do almost 1,000 times more for the same price. So now we are far beyond superhuman in many, many applications. The big question is: when will we have an AI that is not only superhuman in this particular game, like chess, or in this particular pattern recognition application, but superhuman in basically everything? But, you know, I wouldn’t worry about that. That can still take months, if not years.

43:47

Esther

Hello, Jürgen. My name is Esther, and I’m a bit concerned because sometimes artificial intelligence is used for malicious purposes. I was wondering how we could prevent these situations, and what the main ethical challenges are that we face.

44:02

Schmidhuber

That’s also a very important question, and I think the answer is: we cannot prevent that, because there is no “us,” the humans. There are only 8 billion different opinions about what is good for them, and some of these people are fighting each other. And as we speak, we have a war—Russia attacking Ukraine and other wars—and both of these parties are using drones equipped with artificial intelligence to improve their own chances, and to kill the enemies, and to use pattern recognition to detect camouflaged tanks and stuff like that, which maybe a human observer wouldn’t immediately see, and they are using AI in thousands of new ways every single day, and you cannot stop it. Because to them it’s about life and death.

45:00

So it’s true. There’s a lot of AI weapons research. However, none of the AI weapons that are currently visible on the horizon are remotely as destructive as the things we should really be worried about. I’m still much more worried about sixty-year-old technology in terms of hydrogen bombs mounted on rockets, which can destroy a huge city within a few milliseconds, a city with ten million inhabitants. Yes, today we have these drones and they try to pick out one single face in a crowd, and this is worrisome in many, many ways. But what’s much more worrisome is that some people have access to nuclear bombs that, within two hours, can wipe out most of civilization as we know it—without any AI.

46:04

I see no way of stopping it, however, because all the big powers are going to say, “If we don’t do it, then the others are going to do it and will have an advantage.” So you can’t really stop that arms race. On the other hand, five percent of AI research is about improving weapons, but ninety-five percent of all AI research is really about making human lives easier. Because the major companies that are doing a lot of AI research—AI applications at least; maybe not so much fundamental research, but applications of AI—these companies want to sell you something, and you are going to buy only stuff that you think is good for you. So they are competing with each other, each of them trying to create a product which is hopefully better for you, such that you are going to buy it. That’s why there’s a tremendous bias in humankind towards good AI. It’s just a simple commercial bias, because they want to sell you something.

47:12

Marta

Hello, Jürgen. How are you? I’m Marta. You have been a key figure in the field of deep learning. If you had to bet on the next big revolution in artificial intelligence, in which area do you think it will take place?

47:27

Schmidhuber

I think it’s going to be connected to something that is dear to my heart, which is this thing I call meta-learning. So if you look at today’s learning algorithms, they are human-designed. So some human thought about how to create a method for making some of these weights on these connections between the neurons stronger and others weaker, and some of these methods are better than other methods. So there’s a competition between all these scientists. But once there’s a good method like that, then it’s used by many people, and you are stuck with it. It’s not improving itself.

48:11

Almost 40 years ago—not quite; in 1987—I published this thesis. It’s my first publication ever, and it was about trying to overcome this through a learning system that not only learns something here and something there, and that’s it, but also learns to inspect its own learning algorithm and to modify and improve it, such that it becomes a better learning algorithm, and isn’t stuck forever with the original human-designed way of improving itself. No, it also finds a way of improving the way it improves itself, and then recursively finds a way of improving the way it improves the way it improves itself, and so on, without any limits except the limits of computability and physics. So that was meta-learning. And back then, as I told you before, nobody was interested in it. But today it’s a really hot topic, and many people are working on it.

49:20

And in very recent years we have had very nice examples where a neural network learns to implement one of these popular learning algorithms. One of them is called backpropagation—it doesn’t matter if you have never heard the name; it’s a famous way of making some of these weights stronger and others weaker. The network itself learns to implement this learning algorithm, but in a way such that you can improve this thing which is now running on the network itself, such that it can learn to create a better learning algorithm. And you can imagine: to the extent that you are not stuck with these human-designed learning algorithms, you can get better and better systems that keep self-improving, that keep improving themselves without any limits, and I guess that will be the future.
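A deliberately tiny illustration of the meta-learning flavor (not his 1987 system): the learner updates a weight, while a meta-step simultaneously adjusts the learning rule itself, here merely its step size, instead of leaving that rule fixed:

```python
# Toy meta-learning sketch: improve the improver.

def loss(w):
    # The base task: drive w toward 3.0.
    return (w - 3.0) ** 2

w, lr = 0.0, 0.01          # base parameter and the (learnable) learning rate
for step in range(200):
    grad = 2 * (w - 3.0)
    # Meta-step: try two mutated learning rules and keep whichever
    # would make the NEXT base step better.
    for candidate in (lr * 1.1, lr * 0.9):
        if loss(w - candidate * grad) < loss(w - lr * grad):
            lr = candidate
    # Base step: apply the currently best learning rule.
    w -= lr * grad

print(f"w = {w:.4f}, self-tuned learning rate = {lr:.4f}")
```

The base learner and the rule that improves it are both being improved, which is the recursion described above in its smallest possible form.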

50:14

Sofia

Thank you, Jürgen. It’s a pleasure to hear you. My name is Sofia, and I want to ask you more about behaviors. If artificial intelligence learns from us humans, it also learns about our flaws and our prejudices. Can you share an example?

50:30

Schmidhuber

Yes, AI absolutely learns from the data that biased people give to it, and there are lots of criticisms leveled against AI just for that. For example, there are skin cancer detection AIs which learn to distinguish harmless spots on the skin from cancer—from melanoma, I think it’s called. And the first systems of that kind were trained only on the skin of people with fair skin. Then they completely failed when it came to people with dark skin. So that’s a famous example of how you get a biased AI, because it learns only what it sees. It learns from the training data, and it doesn’t generalize to things it has never seen, or never seen enough of.

51:35

Now, the answer of course is that you have to remove the bias: you have to give these AIs training data from all human races and all skin colors and everything. And that’s what’s happening, of course. The doctors are not stupid, and they learn to reduce the bias—also the bias that you get, for example, if you have data only from male persons and not enough data from female persons, or vice versa. Of course you have to correct this bias. But in principle it’s easy to correct: you just have to collect more data from the underrepresented groups, and then this type of bias goes away.
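The simplest version of that fix can be sketched in a few lines: rebalance the training set so underrepresented groups carry equal weight. This is illustrative only; de-biasing real medical data involves much more than resampling:

```python
import numpy as np

rng = np.random.default_rng(4)
# Imagine 900 training images of fair skin and only 100 of dark skin.
groups = np.array(["fair"] * 900 + ["dark"] * 100)

# Oversample each group to the size of the largest one.
target = max((groups == g).sum() for g in np.unique(groups))
balanced_idx = []
for g in np.unique(groups):
    idx = np.flatnonzero(groups == g)
    balanced_idx.extend(rng.choice(idx, size=target, replace=True))

balanced = groups[np.array(balanced_idx)]
print({g: int((balanced == g).sum()) for g in np.unique(balanced)})
# {'dark': 900, 'fair': 900} -- both groups now equally represented
```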

52:21

Generally speaking, all of us are biased towards the things we have seen as kids. Somebody who was raised in Spain has really different ideas about how the world works than, say, an Eskimo, because the environment as a kid is totally different, and the Eskimos learn to distinguish hundreds of different kinds of snow. They see little patterns in the snow which tell them what kind of snow that is. And if you were raised in the desert, you maybe have never seen snow before. So even humans are super biased towards the training data they receive. It’s just the same with AIs.

53:08

Sergio

Hello Jürgen, my name is Sergio. Within twenty or thirty years, do you see yourself trusting an artificial intelligence in a robot with your care?

53:16

Schmidhuber

Well, I certainly do, because I know that these artificial doctors often are much better at picking out certain patterns that are suspicious and should be studied further. At the moment we always have a human in the loop, because what’s currently happening in medical applications is that these programs make suggestions to the doctors and say: “Look, I found this and this and this,” and maybe the doctor then says, “Oh, I would have found this and this, but not this. It’s good that you show me that, and I will order an additional exam just for this one.” So what we currently have is this combination of AIs and humans, because of course in healthcare there’s a huge regulation effort, and they won’t simply allow AIs to take over. No, first this has to be tested, and only if it’s ten times better than a human is it going to become mandatory.

54:24

So at the moment I’m trusting the combination of AIs and humans, which is probably better than humans alone. At some point in the future, I guess, in many cases there won’t be a human in the loop any longer. Just like with airbags. The AI of an airbag is really simple: it just has a sensor, and if certain pressures exceed certain limits then it explodes and hopefully saves your life. There was a time when there were no airbags, and in a country like Spain you had, I guess, probably twenty dead people per day in car accidents. Then the airbags came, and the regulatory offices had a look at them. At some point the airbags were so good that they could reduce the mortality rate to maybe only five people per day rather than twenty.

55:22

Now, it’s still the case that sometimes the airbag does the wrong thing. The airbag explodes, and maybe your car is going down the slope and you land in the river, and you can’t escape from the car because of the airbag, and you drown. So sometimes, in a small percentage of all cases, the airbag actually does the wrong thing. But all societies use statistics in the evaluation of machines like airbags, and as soon as it was clear that you would have four times fewer dead people on the roads every single day, Spain made airbags mandatory, and now they are mandatory. So societies as a whole use the statistical approach: yeah, maybe sometimes this machine is not going to work, but on average it’s going to work so well that it’s going to save a lot of lives. So we do it. And the same is going to be true for AIs in healthcare and AIs in self-driving cars, and in every application.

56:32

Audience

Hello, Jürgen. Thank you very much for today’s session; it’s been very inspiring. You have been called the father of artificial intelligence, and the precursor of many of the technologies that are changing our lives—we have them in our pockets and in our homes. But how would you like to be remembered?

56:53

Schmidhuber

How would I like to be remembered? By whom? I hope my kids will remember me as a decent dad. And when it comes to AI, you also have to ask: remembered by whom? Because maybe you recall that I mentioned that at some point in the not-so-distant future, almost all intelligence is going to be outside of human brains. So almost all memory of the past is going to be outside of human brains, in AIs and in AI scientists. And I bet these AI scientists, just like human scientists, will be super interested in how they emerged from this weird thing that we call civilization; from this collection of biological individuals that started the civilization project maybe 13,000 years ago. And 13,000 years is just like a flash in world history, you know, because world history is 13.8 billion years long. And at the very end there’s this super short thing, one millionth of world history, which is this flash of the civilization thing. And, you know, if you zoom out, then you see that the first guy who had agriculture 13,000 years ago was almost—almost—the same guy who had the first AIs 13,000 years later. So if you zoom out a little bit, and you imagine an AI civilization in the future remembering their origins, they will say, “Yeah, there was this flash when biology became AI. Because biology suddenly created these superorganisms in the form of cities and companies and all kinds of infrastructure and all kinds of tools. And then the tools became smarter, and at some point the tools weren’t tools any longer, but true AIs with their own goals and everything. And they expanded into space. And now almost all intelligence is AI.” And so they will be super interested in how they emerged from this civilization thing.

59:17

But then I guess only the specialists, the AI historians, will want to understand all the little details, you know: how exactly and when exactly was this or that published? How did the speed-up of hardware influence the whole development, and who contributed to the speed-up? And how did all these different developments come together in the form of the first AIs that really deserved the name? So I guess I will just be a tiny little puzzle piece in this huge civilization thing which, in retrospect, collectively, will just look like a flash in world history.

1:00:01

Audience

Hello, Jürgen. Really inspiring! I have a question. If you traveled back in time and talked to the Jürgen who started his professional career forty or fifty years ago, and you explained the current situation to him, would he be disappointed, or positively surprised by what has happened?

1:00:25

Schmidhuber

I think he would say, “I can’t believe it! It’s exactly how I predicted it!” So back then I set myself a goal to build within my lifetime this AI that learns to become smarter than myself before I retire. And we are not there yet, but I’m not retired yet. So it’s looking good, I think. In the near future we’ll have the true AI such that I can be ready to retire. Yeah. So it’s all good. It’s all running according to plan.
