Commentary

Yann LeCun says LLMs are a dead end — is he right or just early?

Nov 17, 2025

Key Points

  • Yann LeCun argues large language models cannot reach artificial general intelligence through scaling alone, but modern LLMs already outperform most humans on specialized tasks like International Mathematical Olympiad problems.
  • LLMs excel only in verifiable domains with clear correct answers like math and code, while struggling with creative and strategic work where ground truth is ambiguous or contextual.
  • LeCun's critique targets a specific subset of AI researchers rather than the field at large, as peers like Andrej Karpathy share his view that new architectures beyond the current paradigm are necessary.

Summary

Yann LeCun, Meta's chief AI scientist and a pioneer of deep learning, has argued for years that large language models are a dead end for artificial general intelligence. The claim resurfaces regularly in profiles and tech discourse, raising the question of whether he's right or simply early.

LeCun's position has been consistent: scaling LLMs further won't reach AGI or artificial superintelligence (ASI) because the models lack genuine reasoning. Yet modern LLMs already outperform most humans at specialized tasks. Frontier models have achieved gold-medal-level scores on International Mathematical Olympiad problems, a feat that requires more than memorization.

LeCun frames himself as disagreeing with everyone, but that's imprecise. George Hotz made nearly the same claim on the Lex Fridman podcast in 2021, arguing that GPT-6 would not be AGI and that the GPT paradigm would not scale to it. Andrej Karpathy and others have echoed the need for new ideas. LeCun is disagreeing with a specific subset of AI researchers, not the field at large.

Karpathy's recent work on AI's economic impact offers a clearer frame. AI isn't like electricity or the industrial revolution; it's closer to a new computing paradigm, and the relevant question is what becomes automatable. In the 1980s, you could predict which jobs computing would eliminate by looking at how fixed their algorithms were: typing, bookkeeping, human calculation. Those tasks had easy-to-specify rules.

With AI, the predictive feature isn't specifiability but verifiability. Tasks that are verifiable, resettable, efficient to attempt, and rewardable through automated feedback can be solved by neural networks trained via reinforcement learning. Math, code, and puzzle-solving with correct answers progress rapidly; tasks combining real-world knowledge, state, context, and common sense lag because they're hard to verify against an objective function.
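
To make the mechanics concrete, here is a minimal sketch of that loop. The policy is a toy (`sample_answer` is a hypothetical stand-in for an LLM generating a solution, and no actual parameter update is shown); the point is the reward function: when an automated checker exists, every attempt yields a training signal for free.

```python
import random

# Toy "policy": guesses an integer answer to an arithmetic prompt.
# In a real pipeline this would be an LLM sampling a solution;
# sample_answer is a hypothetical stand-in.
def sample_answer(prompt: str) -> int:
    return random.randint(2, 18)

def verifiable_reward(prompt: str, answer: int) -> float:
    """Automated check: an arithmetic problem has one correct answer,
    so the reward is exact, cheap, and needs no human in the loop."""
    expected = eval(prompt)  # toy only: prompt is generated below, never untrusted
    return 1.0 if answer == expected else 0.0

# The loop has every property the paragraph names: resettable (a fresh
# prompt each episode), efficient to attempt (one cheap call), and
# rewardable through automated feedback (verifiable_reward). A joke
# prompt would give this loop nothing to optimize against.
for episode in range(5):
    prompt = f"{random.randint(1, 9)} + {random.randint(1, 9)}"
    answer = sample_answer(prompt)
    reward = verifiable_reward(prompt, answer)
    print(f"episode {episode}: {prompt} -> guessed {answer}, reward {reward}")
    # A real trainer would now take a policy-gradient step on this
    # reward; the update is omitted to keep the sketch minimal.
```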

This creates a jagged frontier of progress. LLMs excel at verifiable domains but struggle with creative and strategic work where ground truth is ambiguous. When asked to generate jokes in the style of "You're telling me a shrimp fried this rice," early LLM outputs were mechanical and unfunny. Newer models like GPT-5 with extended reasoning improve somewhat. "You're telling me a hand made this pasta" and "You're telling me a ghost wrote this book" show genuine structure, but the output remains brittle compared to human creative intuition.
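
To make that asymmetry concrete, a deliberately minimal sketch (both function names are illustrative, not from any library): the math checker is a single comparison, while an honest attempt at a humor checker immediately runs out of ground truth.

```python
def math_reward(answer: int, expected: int) -> float:
    # Ground truth exists: verification is a single comparison.
    return 1.0 if answer == expected else 0.0

def joke_reward(joke: str) -> float:
    # No ground truth to compare against. Any real implementation must
    # fall back on a proxy (human raters or a learned reward model),
    # which is noisy, expensive, and gameable; this is the "hard to
    # verify against an objective function" problem in miniature.
    raise NotImplementedError("humor has no automated verifier")
```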

The limitation LeCun identifies is specific and architectural: LLMs excel where tasks are verifiable and fail where they aren't. Whether that ceiling matters for AGI, or whether new architectures and training paradigms will lift it, remains genuinely open.