Artificial intelligence is at a turning point. At this year’s AI Action Summit in Paris, Yann LeCun—often dubbed one of the “godfathers of deep learning”—challenged the prevalent notion that bigger Large Language Models (LLMs) alone will deliver human-level intelligence. Instead, he proposed a holistic approach based on world models, energy-based architectures, and hierarchical planning. In a dynamic talk entitled “The Shape of AI to Come!”, LeCun outlined why the brute-force scaling of LLMs is insufficient and how the next generation of AI can truly “understand” the world, rather than merely predict text tokens.
Below is an in-depth look at his vision, including its implications for complex domains such as healthcare and biology.
LeCun drew a clear line between current LLMs (like GPT-style transformers) and genuine machine intelligence. LLMs learn by predicting the next token, a word or sub-word, based on vast amounts of text. This method has yielded impressive capabilities in language tasks. However, he underscored a fundamental limitation: next-token prediction captures the statistical regularities of text, not the world that the text describes.
In short, LLMs excel at regurgitation and pattern matching, but they don’t possess an internal representation of reality, such as a sense of how objects move or how diseases progress.
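To ground the critique, here is a minimal, hypothetical sketch of the next-token objective LeCun is pushing back against. The vocabulary size, dimensions, and data below are made up; the point is that the entire training signal is "which token comes next," and nothing more:

```python
import torch
import torch.nn as nn

# Toy vocabulary and a stand-in "corpus" of token ids (illustrative only).
vocab_size, embed_dim = 50, 32
tokens = torch.randint(0, vocab_size, (1, 64))  # one sequence of 64 tokens

model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),  # logits over the next token
)
loss_fn = nn.CrossEntropyLoss()

# The entire objective: given token t, score candidates for token t+1.
logits = model(tokens[:, :-1])                  # (1, 63, vocab_size)
loss = loss_fn(logits.reshape(-1, vocab_size),  # predicted distributions
               tokens[:, 1:].reshape(-1))       # actual next tokens
loss.backward()
```

Real LLMs add attention, depth, and enormous scale, but the supervision signal remains this one-step text prediction, which is precisely LeCun's point.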
Instead of fixating on bigger LLMs, LeCun advocated for "world models": systems that capture the underlying structure and dynamics of environments. These models enable an agent to predict the consequences of its actions, simulate outcomes before committing to them, and plan over longer horizons.
LeCun’s assertion is that world models will allow AI to move from “memorizing and predicting” to “understanding and doing.”
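As an illustration, a world model in its simplest form is a learned transition function: given the current state and an action, predict the next state. The sketch below is hypothetical, with placeholder dimensions and random stand-in data, and is far simpler than anything LeCun proposes, but it shows how the learning target shifts from "next token" to "next state":

```python
import torch
import torch.nn as nn

# A toy "world model": learn how state changes under an action,
# rather than which token follows which. All sizes are illustrative.
state_dim, action_dim = 8, 2

dynamics = nn.Sequential(
    nn.Linear(state_dim + action_dim, 64),
    nn.Tanh(),
    nn.Linear(64, state_dim),  # predicted next state
)

# Training data would be observed transitions (s, a, s').
s      = torch.randn(128, state_dim)
a      = torch.randn(128, action_dim)
s_next = torch.randn(128, state_dim)  # stand-in for real observations

pred = dynamics(torch.cat([s, a], dim=-1))
loss = nn.functional.mse_loss(pred, s_next)
loss.backward()
```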
A major highlight of LeCun's talk was the transformative potential of such world-model-based AI in complex fields like healthcare, biology, and pharmaceutical research, where systems must capture how dynamic processes such as disease progression unfold over time rather than merely describe them in text.
LeCun’s own framework, the Joint Embedding Predictive Architecture (JEPA), embodies his philosophy of discarding brute-force generative models in favor of learning directly in representation space. Instead of predicting every pixel in a future video frame or every token in a sentence, JEPA learns a more abstract, “critical” representation of reality—keeping only what matters for accurate prediction and planning.
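The following is a rough, hypothetical sketch in the spirit of JEPA, not Meta's actual implementation: a context encoder and a predictor are trained so that the predicted embedding of a held-out target matches the target encoder's embedding, with the loss computed entirely in representation space rather than over pixels or tokens:

```python
import torch
import torch.nn as nn

# JEPA-style sketch: predict the *embedding* of a masked target from the
# embedding of the visible context -- never the raw input itself.
dim, embed = 128, 64

context_encoder = nn.Linear(dim, embed)
target_encoder  = nn.Linear(dim, embed)  # in practice, often an EMA copy
predictor       = nn.Sequential(nn.Linear(embed, embed), nn.GELU(),
                                nn.Linear(embed, embed))

x_context = torch.randn(32, dim)  # visible part of the input
x_target  = torch.randn(32, dim)  # held-out part to be predicted

with torch.no_grad():             # stop-gradient through the target branch
    z_target = target_encoder(x_target)

z_pred = predictor(context_encoder(x_context))
loss = nn.functional.mse_loss(z_pred, z_target)  # loss lives in embedding space
loss.backward()
```

Because the loss never touches raw pixels or tokens, the encoder is free to discard unpredictable detail and keep only what matters for prediction, which is the "critical representation" idea in miniature.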
LeCun introduced energy-based models as a way to transcend the limitations of simple token or pixel prediction. In these models, an “energy function” assesses how compatible a proposed explanation (or action) is with given observations. Instead of passively mapping inputs to outputs, the AI searches for an optimal set of actions or states with minimal “energy”—akin to how humans deliberate and plan before acting.
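Here is a minimal, hypothetical sketch of energy-based inference, assuming a small network as the energy function: rather than emitting an answer in one forward pass, the system optimizes a candidate answer until its energy, its incompatibility with the observation, is low:

```python
import torch
import torch.nn as nn

# A tiny stand-in energy function: scores how compatible a candidate
# answer y is with an observation x. Low energy = compatible.
x_dim, y_dim = 16, 4
energy_net = nn.Sequential(nn.Linear(x_dim + y_dim, 64), nn.Tanh(),
                           nn.Linear(64, 1))

x = torch.randn(1, x_dim)                      # fixed observation
y = torch.zeros(1, y_dim, requires_grad=True)  # candidate to refine

opt = torch.optim.SGD([y], lr=0.1)
for _ in range(50):              # "deliberation": descend the energy surface
    opt.zero_grad()
    energy = energy_net(torch.cat([x, y], dim=-1)).sum()
    energy.backward()
    opt.step()
# y is now the answer the model finds most compatible with x.
```

The inner optimization loop is what distinguishes this from a feed-forward predictor: the model spends compute searching for a good answer, loosely analogous to deliberation.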
He highlighted a long-term objective: hierarchical planning. Much like a human who divides a complex goal (say, traveling from New York to Paris) into incremental steps, AI needs multiple layers of abstraction. Each level in the hierarchy handles progressively simpler actions, culminating in robust, real-world problem-solving.
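The travel example can be made concrete with a toy, hand-written planner (no learning involved): each level expands a goal into simpler steps for the level below, until only primitive actions remain:

```python
# A toy illustration of hierarchical planning with hard-coded decompositions.
HIGH_LEVEL = {
    "New York -> Paris": ["get to JFK", "fly JFK -> CDG", "get to hotel"],
}
LOW_LEVEL = {
    "get to JFK":     ["pack bag", "hail taxi", "ride to airport"],
    "fly JFK -> CDG": ["check in", "board plane", "land at CDG"],
    "get to hotel":   ["take the train", "walk to hotel"],
}

def plan(goal: str) -> list[str]:
    """Recursively expand a goal until only primitive actions remain."""
    if goal in HIGH_LEVEL:
        return [step for sub in HIGH_LEVEL[goal] for step in plan(sub)]
    if goal in LOW_LEVEL:
        return LOW_LEVEL[goal]
    return [goal]  # already primitive

print(plan("New York -> Paris"))
```

In LeCun's vision, each level's decomposition would be produced by learned world models operating at different timescales, not by a lookup table as in this toy.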
While RL has been a cornerstone of certain AI feats (like AlphaGo), LeCun described it as too data-inefficient for real-world tasks, especially where high sample efficiency is crucial (e.g., robotics, clinical trials). World models offer a more direct path: learn the environment's dynamics once, then plan within the learned model instead of through millions of real-world trials, as the sketch below illustrates.
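Here is a minimal, hypothetical sketch of that planning loop, with a hand-coded stand-in for the learned dynamics model: sample candidate action sequences, roll each one out inside the model, and keep the cheapest plan. Once the model exists, no real-world trial and error is needed:

```python
import torch

def dynamics(state, action):  # stand-in for a *learned* world model
    return state + 0.1 * action

def cost(state, goal):        # distance remaining to the goal
    return torch.norm(state - goal, dim=-1)

state, goal = torch.zeros(2), torch.tensor([1.0, -1.0])
candidates = torch.randn(256, 10, 2)  # 256 random 10-step action sequences

# Roll every candidate sequence out inside the model, in parallel.
rollout = state.expand(256, 2)
for t in range(10):
    rollout = dynamics(rollout, candidates[:, t])

best = candidates[cost(rollout, goal).argmin()]  # keep the cheapest plan
print("first planned action:", best[0])
```

This "random shooting" search is one of the simplest model-predictive schemes; the contrast with model-free RL is that every trial here is imagined inside the model rather than executed in the world.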
Perhaps one of the most forward-looking points was LeCun's call for open-source AI platforms. He warned of a scenario in which a few large technology companies or geopolitical powers dominate the creation of foundational models, leaving the rest of the world to rely on "black box" solutions with limited transparency. A collaborative, open-source approach, he argued, would broaden access, keep foundational models transparent and auditable, and allow AI systems to reflect the world's diversity of languages and cultures rather than the priorities of a handful of firms.
LeCun’s central thesis—“LLMs alone are not enough”—resonated throughout the summit. The ultimate promise of AI lies in understanding the world deeply enough to model it, reason about it, and plan effective actions. In healthcare, this shift could revolutionize how we predict and treat diseases. In robotics, it may finally bring about safe, reliable domestic assistants. In every domain, the marriage of model-based reasoning, abstract representation, and constrained planning charts a path toward AI systems that are not only powerful but also genuinely aligned with human needs.
“Blindly scaling up large language models,” LeCun emphasized, “won’t magically yield human-level intelligence. What we need is a true comprehension of reality—built on robust world models, hierarchical planning, and a more direct, energy-efficient learning paradigm.”
Yann LeCun’s talk at the 2025 AI Action Summit was a clarion call for reimagining AI research and deployment. Far from dismissing the utility of LLMs, he emphasized their limitations and the urgent need for AI that grasps causality and physical context. His JEPA framework and energy-based models represent significant steps in that direction, with profound implications for healthcare, biology, robotics, and beyond.