In this first part of my dive into the new El Dorado we call AGI (artificial intelligence that is supposed to reach the same level as a human being), I wanted to go back to the basics, and in particular to the history of neural networks, which today sit at the heart of many of the consumer AI applications we use: voice assistants, recommendation engines, text and image generation tools… everything starts from there.
A simple example to understand neural networks
I’ve always found this example to be a good way to explain what a neural network is: it has a concrete purpose that is easy to understand.
Imagine a company that has accumulated a lot of data on its past prospects. It would now like to use artificial intelligence to predict whether a new prospect is likely to become a customer.
For this, we can use the available information: the prospect’s business sector, size, revenue, number of meetings held… And since these are old records, we also know whether, in the end, the prospect signed or not.
The neural network will analyze all this data and try to understand what, in general, makes a prospect become a customer. Once the model is trained, we can use it for each new prospect and obtain an estimate of the chances of a sale.
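To make this concrete, here is a minimal sketch in Python, assuming a scikit-learn setup. The dataset, the feature names (sector code, company size, revenue, number of meetings held), and the labels are entirely made up for illustration; a real project would pull them from a CRM and use far more records.

```python
# Minimal sketch: a tiny neural network trained on made-up historical prospect
# data, then asked for the probability that a new prospect signs.
# All values below are illustrative, not real CRM records.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Hypothetical features: [sector_code, company_size, revenue_k, meetings_held]
X = np.array([
    [0,  12,  150, 1],
    [1, 250, 4800, 5],
    [0,  40,  900, 3],
    [2,   8,   80, 1],
    [1, 130, 2100, 4],
    [2,  60, 1200, 2],
])
y = np.array([0, 1, 1, 0, 1, 0])  # 1 = the prospect eventually signed

# Put features on comparable scales, then train a small multilayer network
scaler = StandardScaler()
model = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
model.fit(scaler.fit_transform(X), y)

# New prospect with the same features but an unknown outcome
new_prospect = scaler.transform([[1, 90, 1500, 3]])
print(f"Estimated probability of signing: {model.predict_proba(new_prospect)[0, 1]:.2f}")
```

With six rows this is obviously a toy, but the workflow is the one a real project would follow at scale: historical features plus known outcomes go in, a probability comes out for each new prospect.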
The major milestones in neural network history
Here are the main historical milestones that led us to where we are today:
- 1943: McCulloch and Pitts imagine for the first time an artificial neuron, laying the theoretical foundations of neural networks.
- 1950s: Frank Rosenblatt creates the perceptron, the first functional neural network capable of learning (a minimal sketch of its learning rule follows this list).
- 1970–1986: Backpropagation takes shape: Seppo Linnainmaa formalizes its mathematical foundations (1970), Paul Werbos introduces the algorithm (1974), and Rumelhart, Hinton, and Williams popularize it as the key learning method for multilayer neural networks (1986).
- 1989: First large-scale real-world use: the United States Postal Service relies on a neural network to automatically read handwritten ZIP codes on mail.
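And here is the minimal sketch promised above: Rosenblatt's perceptron learning rule, written in Python on a toy problem (learning the logical AND function). It illustrates the idea only; the data and settings are arbitrary, and Rosenblatt's Mark I Perceptron was a physical machine, not a script.

```python
# Rosenblatt's perceptron learning rule on a toy task: the logical AND function.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # the four possible inputs
y = np.array([0, 0, 0, 1])                      # target output: logical AND

w = np.zeros(2)  # weights, start at zero
b = 0.0          # bias
lr = 0.1         # learning rate

for epoch in range(10):
    for xi, target in zip(X, y):
        prediction = 1 if xi @ w + b > 0 else 0  # step activation
        error = target - prediction              # 0 if correct, +1 or -1 if not
        w += lr * error * xi                     # nudge weights toward the target
        b += lr * error

print("learned weights:", w, "bias:", b)
```

This handful of lines is essentially what "learning" meant in the 1950s: adjust the weights a little every time the output is wrong, and for linearly separable problems the process is guaranteed to converge.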
There are many other important milestones after these, but in my opinion these are the facts that mark the fundamental origins of the technology. And the observation is striking: if we take the public release of ChatGPT (built on GPT-3.5) in 2022 as the start of the current AI era, it took nearly 80 years for a fairly clear scientific vision to become a consumer tool used daily by millions of people.
The limitations of neural network predictions
Some of you will no doubt have noticed that the example of neural network use I presented has several limitations. Even if the model produces a mathematically rigorous probability, that doesn’t mean the prediction will necessarily come true in reality.
Indeed, the conclusion of a sale depends on many factors that are difficult to measure or absent from the data: the company’s positioning in its market, the local or international economic context, or even the quality of the human relationship between the salesperson and the buyer. All these elements can influence the final decision without ever appearing in the model.
Language, a more stable playground
Ideally, then, we would want to find a domain where predictions are more reliable.
One such domain is language, the playground of LLMs (Large Language Models), which focus on a very specific task: predicting the next word from a given text and its context. This objective, apparently simpler than predicting whether a sale will close, has a major advantage: it is more stable, more consistent, and above all more universal.
And it’s precisely this simplicity that makes them powerful: these models manage to understand the meaning of questions, grasp the provided context, and formulate coherent, relevant responses, as if they were reasoning, when in fact they are simply continuing a text in a statistically plausible way.
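Here is a minimal sketch of that next-word objective, assuming the Hugging Face transformers library and the small GPT-2 model as a stand-in for a modern LLM (the prompt is arbitrary, and the weights are downloaded on first run):

```python
# Ask a small language model for its probability distribution over the next token.
# Requires: pip install torch transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The history of neural networks begins in"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# Softmax over the vocabulary gives the probability of each candidate next token
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id.item())!r}: {prob.item():.3f}")
```

A chatbot answer is just this step repeated: pick a plausible next token, append it to the text, and ask the model again.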
Continuity in fundamental principles
What strikes me in all this history is how clear, precise, and surprisingly durable the initial theoretical principles have been, whether for neural networks or for generative models. We also realize that, despite the impression of magic, the objectives of these technologies remain relatively simple: predicting a word, an image, a behavior… And yet, once scaled up, these systems become capable of performance that surpasses us in certain respects.
And now, heading for AGI
The next announced step is AGI, or Artificial General Intelligence: a form of artificial intelligence capable of understanding, learning, and reasoning in a broad, flexible, cross-domain way… like a human.
So, where are we today? Do solid scientific foundations for AGI already exist? What theoretical principles are researchers exploring? Do we have a clear vision of this revolution, or are we still flying blind?
This is what we’ll explore in the second part.