
The Limits of AI Scaling

Updated: Sep 26

LLMs Just Can’t Get There from Here


Are the current approaches to Artificial Intelligence self-limiting?


Credit: © 2025 Andrew Borg + OpenAI Sora 1.0 + Google Gemini Nano Banana

Gary Marcus is a well-known American cognitive scientist, psychologist, and entrepreneur, widely recognized for his critical perspectives on artificial intelligence (AI). In a recent blog post, "Peak Bubble," he argues that current valuations of companies in the Generative AI (GenAI) industry sector are irrationally inflated and approaching a speculative bubble.

How could this be? Aren’t we in an AI “boom time”? Vast amounts of money continue to pour into the sector, and the topic of AI has captured media, political, and popular attention around the world.

Marcus highlights a fundamental limitation in all of today’s Large Language Model (LLM)-based AI systems, suggesting they replicate patterns without truly grasping the underlying meaning or logic. He points out that their persistent failures to reason reliably about relationships between components and the larger structures they form demonstrate a lack of deep understanding.



He argues that AI's capabilities remain superficial, rooted in statistical pattern-matching rather than substantive cognitive insight.

He also expresses concern about the financial projections of OpenAI, currently the most highly valued AI company, noting that despite ambitious claims the company does not anticipate profitability until 2030 – and reportedly lacks sufficient funds for the major contracts it has committed to. Ultimately, he warns against the unrealistic expectations surrounding GenAI's future revenue and infrastructure demands, suggesting the market far exceeds any plausible future economic reality.

DEFINITIONS                          

Generative Artificial Intelligence (Generative AI, GenAI, or GAI) is a subfield of artificial intelligence that uses generative models to produce text, images, videos, or other forms of data. These models learn the underlying patterns and structures of their training data and use them to produce new data based on the input, which often comes in the form of natural language prompts. Source: Wikipedia

Artificial General Intelligence (AGI)—sometimes called human‑level intelligence AI—is a type of artificial intelligence that would match or surpass human capabilities across virtually all cognitive tasks. Source: Wikipedia 

Artificial Superintelligence (ASI) is usually considered a consequence of the accelerated development of AGI, and is thought of as a hypothetical agent that possesses intelligence far surpassing that of the brightest and most gifted human minds.

The funding pouring into the current approach to generative AI, particularly the scaling hypothesis (“bigger is better”), is immense: total investment in this direction has already surpassed the half-trillion-dollar mark. It continues despite the recent disappointing performance of models like OpenAI’s GPT-5. Common sense argues against this commitment to ever-bigger LLMs and the data centers they require – with massive numbers of graphics processing units (GPUs) drawing ever more natural resources to feed their insatiable hunger – in the hope of somehow magically extracting human-level intelligence from a straightforward scaling of the LLM approach.

Current and upcoming LLMs powering generative AI platforms such as OpenAI’s GPT-5, xAI’s Grok, Anthropic’s Claude, and Google’s Gemini are not broken, but they are insufficient. Many critics view them as having "hit a wall" due to core structural flaws and persistent errors, confabulations, and hallucinations, suggesting that "pure scaling simply isn’t the path to AGI".


It’s important to understand the difference between the functionality of GenAI (i.e. the current models) and the long-term goal of Artificial General Intelligence (AGI). According to astrophysicist and author Neil deGrasse Tyson, LLM systems produce only self-limited information, as they “can only know what already exists on the internet.”


Moreover, Marcus underscores this fundamental limitation in all of today’s LLM-centric GenAI systems, suggesting they replicate patterns without truly grasping the underlying meaning or logic. He points to the inability of today’s AI to reason reliably about relationships between components and their surrounding structure, a failure that demonstrates their lack of deep understanding. He argues that today’s AI capabilities remain superficial, rooted in imitation rather than substantive cognitive insight.


GenAI vs. AGI:

| Aspect | Generative AI (GenAI) | Artificial General Intelligence (AGI) |
| --- | --- | --- |
| Purpose / Capability | Produces content; good at specialized generation (text, images, etc.) given prompts. | Broad intelligence; ability to perform any intellectual task a human can; adapts, reasons, and generalizes widely. |
| Current Reality | Exists now; many tools (ChatGPT, DALL-E, etc.) using GenAI are in wide use. | Largely theoretical / a research goal; not yet achieved. |
| Risk / Challenge | Issues include inaccuracy (“hallucinations”), bias, misuse of generated content, intellectual property, privacy, etc. | A bigger scope: additional challenges around alignment (ensuring AGI’s goals match human values), ethics, safety, control, and unintended consequences. |
| Use / Applications | Content generation, creative tasks, tools that assist humans (writing, art, design, etc.). | Potential to outperform humans broadly; could do research, make decisions, and generalize across domains without being retrained for each. |

To address these fundamental shortcomings and maintain momentum on the path to Artificial General Intelligence (AGI), a "rethink" of the underlying approach is required.


What’s Needed for the Leap to AGI

© 2025 Andrew Borg + OpenAI Sora 1.0 + ChatGPT-

A true AGI platform will need to integrate four core capabilities and attributes.


1. Hybrid Reasoning Systems

The most critical architectural shift required is moving away from the pure scaling approach, which relies heavily on attention mechanisms in LLMs. Current models are lauded for their “Chain of Thought” capability, a purported breakthrough enabling them to “think out loud” by generating intermediate reasoning steps. However, recent research at Arizona State University concludes that “Chain of Thought reasoning is ‘a brittle mirage that vanishes when it is pushed beyond training distributions’.” [1]

The next-generation platform should synthesize reasoning, learning, and cognitive modeling, an approach known as neurosymbolic AI. This integration is necessary because, despite massive investment, scaling alone has not reduced hallucinations; methods are needed that enhance accuracy and reliable decision-making.
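The neurosymbolic idea can be sketched in a few lines: a statistical component proposes candidate answers (standing in for an LLM), and a symbolic component accepts only those consistent with explicit knowledge. All names and data here are hypothetical, a minimal illustration rather than any real framework.

```python
# Toy neurosymbolic sketch: neural "proposer" + symbolic "verifier".
# Everything here is illustrative, not a production system.

def neural_propose(query):
    # Stand-in for an LLM: returns ranked candidate answers,
    # some of which may be confabulated.
    return ["Paris", "Lyon", "Paris, Texas"]

# Explicit, inspectable symbolic knowledge base.
CAPITAL_OF = {"France": "Paris", "Italy": "Rome"}

def symbolic_verify(country, answer):
    # Deterministic rule check: accept only answers entailed
    # by the knowledge base.
    return CAPITAL_OF.get(country) == answer

def answer(country):
    for candidate in neural_propose(f"capital of {country}"):
        if symbolic_verify(country, candidate):
            return candidate
    return None  # refuse rather than hallucinate

print(answer("France"))  # Paris
```

The design point is the division of labor: the statistical component supplies fluency and recall, while the symbolic layer supplies the guarantees that scaling alone has not delivered.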


2. Explicit World Models

A world model is used to simulate or reason about the external environment (external ‘reality’). Current LLMs are criticized for lacking genuine world models and comprehension. To overcome this limitation, a next-generation platform would integrate clearly represented and system-accessible (i.e. explicit) world models into its reasoning. Current models typically bury their world models implicitly in a neural network or other black-box system.
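What "explicit" means can be shown with a toy blocks-world model: the state and the transition rules live in an inspectable structure the system can query and simulate, instead of being diffused through network weights. The class and its rules are invented for illustration only.

```python
# Minimal sketch of an explicit world model: state and dynamics are
# represented in a structure the system can inspect and simulate.
from dataclasses import dataclass

@dataclass
class BlockWorld:
    # Explicit state: which block sits on which support,
    # e.g. {"A": "table", "B": "A"} means B is stacked on A.
    on: dict

    def is_clear(self, block):
        # A block is clear if nothing rests on top of it.
        return block not in self.on.values()

    def move(self, block, dest):
        # Transition function with explicit preconditions: the model
        # rejects physically impossible actions instead of producing
        # a fluent but wrong description of the result.
        if not self.is_clear(block):
            raise ValueError(f"{block} has something on top of it")
        self.on[block] = dest

world = BlockWorld(on={"A": "table", "B": "A"})
world.move("B", "table")  # legal: B was clear
# world.move("A", "B") would have raised before B moved: A wasn't clear.
```

Because the state is explicit, the system can check "what would happen if" before acting, which is exactly the kind of grounded simulation implicit LLM representations struggle to support.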

Explicit world models will enable the AI to move past statistical mimicry to achieve deeper understanding of the real world. This is an area of active investment, with growth in this sector more than tripling in the last two years, accelerated in part by the global interest in AI embodied in physical systems (e.g. intelligent robotics).


3. Robust Generalization and Compositionality

Another fundamental weakness of current LLMs is their inability to broadly generalize. They often perform measurably worse when deployed in settings different from their training environment. This shortcoming has persisted despite decades of research and innovation.

The next-generation platform must close the performance and reliability gaps that appear in new situations, allowing the AI to reliably extend rules and concepts beyond its training distribution. Current systems exhibit failures in foundational areas that demonstrate a lack of deep knowledge. New platforms must reliably handle complex conceptual tasks, follow abstract and explicit rules accurately, and combine existing concepts in new ways, adapting to the task at hand.
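Compositionality is easy to illustrate in miniature: once the primitives and composition rules are explicit, novel combinations never seen before are handled reliably by construction. The toy grammar below is hypothetical, loosely inspired by compositional-generalization benchmarks such as SCAN.

```python
# Toy compositional interpreter: explicit primitives plus explicit
# composition rules generalize to unseen combinations by construction.

PRIMITIVES = {"walk": ["WALK"], "jump": ["JUMP"]}

def interpret(command):
    # Grammar: "<verb>", "<verb> twice", "<cmd> and <cmd>"
    if " and " in command:
        left, right = command.split(" and ", 1)
        return interpret(left) + interpret(right)   # sequence rule
    if command.endswith(" twice"):
        return interpret(command[: -len(" twice")]) * 2  # repetition rule
    return PRIMITIVES[command]                      # primitive lookup

# A novel composition is handled correctly without any "retraining":
print(interpret("jump twice and walk"))  # ['JUMP', 'JUMP', 'WALK']
```

Statistical models trained only on some combinations routinely fail on held-out ones; a system with explicit rules cannot, which is the reliability property this section argues next-generation platforms need.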


4. Moral and Ethical Principles

Without intentional design, AI systems may perpetuate biases, infringe on privacy, or cause harm.

Development of AI that aligns with human values and ethical principles is required in order to minimize the risk of ‘rogue’ AI systems that might value their own existence over the goal of human flourishing.

This underscores the critical importance of deliberate and thoughtful integration of ethical frameworks into AI development to ensure that these technologies serve the common good and uphold societal values. Frameworks that accommodate moral and ethical relativism (i.e. support for the myriad local cultural values and their differentiation) would be a prerequisite before widespread implementation of moral principles in AGI makes sense.

Charting the Path Forward

Current AI limitations don't signal the end of the revolution; they mark the beginning of a more mature phase. Bigger is not necessarily better: bigger and smarter is better. Smart organizations are already shifting from pure scaling approaches to hybrid systems that combine pattern recognition with logical reasoning. While the current $500+ billion investment bubble may correct, the underlying technology will certainly continue to evolve. Companies that acknowledge today's constraints while building practical solutions (better reasoning systems, explicit world models, and ethical frameworks) will capture the real value as AI moves from impressive demos to reliable and sustainable growth enablement.

Composed by Andrew Borg, in collaboration with Google NotebookLM, OpenAI GPT-5, Microsoft Copilot, and OpenAI Sora.


[2] Jana Schaich Borg et al., Moral AI: And How We Get There, Pelican Books, Penguin Random House, 2024
