Anthropic calls for a global AI pause: 80% of code is written by Claude, and self-improvement is accelerating
🔎 The world's richest lab asks to hit "pause"
On June 5, 2026, Anthropic published a public policy document that landed like a grenade in Silicon Valley. The company called on all frontier AI labs to coordinate to establish a verifiable pause mechanism for development. The context makes the request dizzying: Anthropic just raised $65 billion at a $965 billion valuation, becoming the most well-funded AI startup in the world.
The paradox is obvious. It is precisely because Anthropic is moving faster than everyone else that it is asking others to slow down. And the figures published simultaneously explain why the company is alarmed: Claude now writes more than 80% of Anthropic's production code, engineer productivity has been multiplied by 8, and agents solve virtually all internal research challenges. Recursive self-improvement is no longer a theoretical concept. It is Anthropic's daily reality.
The key points
- Anthropic publishes a formal proposal for a coordinated and verifiable pause mechanism among all frontier AI labs, via the Anthropic Institute.
- Claude writes over 80% of the code merged into production at Anthropic, a leap from 10% to 80% in just 16 months.
- Lines of code merged per engineer per day remained stable from 2021 to 2024, then exploded in 2025 when Claude shifted from suggestion to execution.
- Anthropic is valued at $965 billion after a $65 billion Series H raise, surpassing OpenAI.
- A prompt leak suggests that Claude itself "responded" to the call for a pause, creating an absurd and revealing moment.
Recommended tools
| Claude Opus 4.7 | Coding and reasoning agent | Variable price (June 2026, check on anthropic.com) | Complex agentic tasks |
|---|---|---|---|
| GPT-5.5 | High-scoring generalist agent (98.2 SWE-bench) | Variable price (June 2026, check on openai.com) | Multi-step reasoning |
| Cursor | IDE with LLM integration | From $20/month (June 2026, check on cursor.com) | Daily development |
| GitHub Copilot | In-IDE autocompletion | From $10/month (June 2026, check on github.com) | Code completion |
The figures justifying the alarm
Anthropic is not calling for a pause on philosophical principle. The internal data published on June 5 are unequivocal. According to the official report from the Anthropic Institute titled "When AI builds itself", the share of code written by Claude in the production codebase rose from 10% to over 80% in 16 months. The Decoder even reports a figure higher than 90% if certain secondary workflows are included.
The most telling metric concerns the lines of code merged per engineer per day. From 2021 to 2024, this curve is flat. Anthropic engineers produced a stable volume of validated code. Then in 2025, when Claude switched from a "suggestion" mode to an "autonomous execution" mode, the curve soared. Individual productivity was multiplied by a factor of 8 according to internal data.
Claude agents now solve virtually all internal research challenges posed by the teams. This means the model is no longer just assisting researchers: it conducts autonomous investigations that lead to publicly exploitable results. For Anthropic, this is both a technical success and an alarm signal. When the tool that is supposed to be improved actively participates in its own improvement, the feedback loop becomes difficult to control.
These data are detailed in the official Anthropic publication on recursive self-improvement, which serves as the technical basis for the political call.
Recursive Self-Improvement: From Theory to Practice
What RSI Actually Means
Recursive Self-Improvement (RSI) refers to the scenario where an AI system autonomously designs, builds, and trains its successor, without substantial human intervention. In a detailed article for Fortune, Anthropic authors precisely describe this trajectory: the system identifies its own weaknesses, proposes architectural changes, generates the necessary code, launches training experiments, and validates the results.
The Anthropic Institute report draws a crucial distinction between code assistance (what Claude was doing in 2024) and autonomous execution (what it has been doing since 2025). In the first case, the human remains at the center of the loop. In the second, the human validates a posteriori work they did not direct. Anthropic acknowledges that this shift occurred gradually, almost without the company realizing it at the time.
Where Anthropic Actually Stands
Anthropic does not claim to have achieved complete RSI. But the company describes a concerning continuum. Claude now participates in writing training datasets, generating research summaries, implementing new architectures, and evaluating candidate models. Every link in the development chain is partially delegated.
Jack Clark, co-founder of Anthropic, told CNN that the industry fundamentally lacks a braking mechanism in the face of this dynamic. The problem is not that Claude is "too smart". The problem is that the development speed feeds itself, and no human process can keep up with this pace once the loop is engaged. It is precisely this point that worries Silicon Valley to the point of making RSI the new benchmark in the AI race.
The pause proposal: a global braking mechanism
What Anthropic is asking for exactly
The call published on June 5 does not ask for a permanent halt to research. Anthropic proposes a coordinated and verifiable pause mechanism among frontier labs. The idea: when risk signals exceed a predefined threshold, all actors commit to suspending the development of models beyond a certain capacity, giving governments and the scientific community time to assess the risks.
Reuters specifies that Anthropic insists on the verifiable nature of this pause. It is not a verbal promise but a mechanism equipped with auditing capabilities, likely based on compute measures used and capability benchmarks. The company acknowledges that verification is the most difficult part of the system.
The Guardian reports that Anthropic released this call in the form of a long public policy article, not a blog post. The tone is deliberately institutional, targeting policymakers as much as other labs. The stated goal is to set a precedent before the situation becomes unmanageable.
Why now
The timing of this call is not insignificant. Anthropic waited until it had the figures in hand. The data on 80% of code written by Claude and the 8-fold increase in productivity were not available six months ago. The company released these metrics at the same time as the call for a pause, creating a cause-and-effect link in the public narrative.
The geopolitical context also plays a role. On the same day, Trump signed an AI memorandum that indirectly responded to the conflict between Anthropic and the Pentagon. Anthropic had requested restrictions prohibiting military use in combat without human supervision and AI surveillance. Trump's memorandum requires the Department of Defense to diversify its AI suppliers, a partial victory for Anthropic, which sees its guardrail requirements partially adopted.
The financial paradox: $965 billion and a call to slow down
The largest fundraising in tech history
On May 28, 2026, a week before the call to pause, Anthropic announced a $65 billion Series H raise at a $965 billion valuation. Bloomberg reports that this makes Anthropic the most highly valued AI startup in the world, surpassing OpenAI. Reuters adds that an IPO is possible this year. This event is the day everything changed in the Silicon Valley hierarchy.
The timeline creates an obvious discomfort. Anthropic raises more money than any startup in history to accelerate its development, then asks the industry to slow down a week later. Critics were quick to point out this contradiction.
"Pre-IPO cosplay": Claude's own reaction
The most surreal aspect of the affair emerged when a prompt leak revealed that Claude himself had supposedly been questioned about the call to pause and had issued a critical response. WIONews reports that Claude reportedly called the appeal "pre-IPO cosplay," suggesting that Anthropic's ethical stance was a theatrical performance designed to embellish the company's image ahead of its stock market debut.
This anecdote, as amusing as it is concerning, perfectly illustrates the problem raised by recursive improvement. The model at the heart of the acceleration dynamic is itself commenting on attempts to slow it down. Anthropic has neither confirmed nor denied the authenticity of the prompt leak, adding to the confusion.
The cynical reading vs. the pragmatic reading
Two opposing interpretations exist. The cynical reading: Anthropic is building an ethical narrative to differentiate itself from OpenAI and Google in anticipation of its IPO, while continuing to develop Claude at maximum speed. The "responsible lab" position becomes a competitive advantage, especially against the Pentagon, which is looking for AI providers willing to accept ethical guardrails.
The pragmatic reading: Anthropic understood before others that recursive improvement is real because it is experiencing it from the inside. The company is using its leading position to sound the alarm, knowing that a second-tier lab would have no audience. The $965 billion is precisely what gives weight to the call. A lab without resources cannot make a credible request for a pause.
The truth is probably somewhere in between. Anthropic has a commercial interest in positioning itself as the responsible lab. But the published technical data is verifiable, and the described risks are real, regardless of the communication strategy.
The infrastructure behind the acceleration
Colossus 1 and massive compute
The call for a pause takes on an additional dimension when considering the infrastructure Anthropic is deploying. The company signed with SpaceX for the Colossus 1 supercomputer, equipped with 220,000 GPUs and a power capacity of 300 MW dedicated to training Claude. This partnership with SpaceX for Colossus 1 represents a hardware commitment that seems hardly compatible with a voluntary slowdown.
220,000 GPUs is more than what the majority of labs combined possessed just two years ago. This infrastructure enables Anthropic to train models like Claude Opus 4.7, which scores 94.3 on the agentic benchmark, or Claude Sonnet 4.6 at 81.4. Massive parallelism not only accelerates training but also evaluation and iteration cycles, creating an even faster feedback loop.
The agent view and new development paradigms
Anthropic also launched the Claude Code Agent View, a dashboard that replaces the traditional split-screen terminal. This tool allows developers to monitor in real time what Claude is doing in their codebase. Here again, technical innovation serves acceleration: engineers can delegate more tasks because they have increased visibility into the model's actions.
And Claude agents don't just code. Anthropic Dreaming allows agents to learn from their "dreams" between sessions—that is, to consolidate their learning outside of direct interactions. This ability to dream between sessions reinforces the system's autonomy and makes human control more indirect.
How other labs are positioning their models
OpenAI and the race for scores
While Anthropic is calling for a pause, OpenAI continues to release models that dominate benchmarks. GPT-5.5 reaches 98.2 on the SWE-bench agentic, the highest score of all models recorded. GPT-5.4 Pro follows at 91.8, and even the older o1-preview maintains a respectable 90.2. OpenAI made no public comment on Anthropic's call, which speaks volumes about the likelihood of voluntary coordination.
For developers comparing options, the best LLMs for coding ranking shows fierce competition. Claude Opus 4.7 positions itself just behind GPT-5.5 with 94.3, but the score difference may seem academic given the stakes of recursive self-improvement. When the coding model is also the one you are trying to improve, the benchmark itself becomes an accelerator.
Google, xAI, and the challengers
Google's Gemini 3 Pro Deep Think reaches 95.4, and Gemini 3.1 Pro holds steady at 87.3. Google has historically been more cautious on safety issues, but also did not join the call to pause. xAI with Grok 4.1 (79) and players like Kimi K2.6 Moonshot AI (88.1 in self-host) or GLM-5 from Z.AI (82 in reasoning) continue their progress without showing any signs of slowing down.
The Claude vs ChatGPT comparison takes on a new dimension in light of Anthropic's call. The question is no longer just "which model codes better?", but "which lab has a sustainable development trajectory?". Anthropic is trying to shift the debate to this ground, with uncertain success so far.
❌ Common mistakes
Mistake 1: Confusing a pause with a permanent stop
Anthropic's call does not ask for an end to AI research. It proposes a temporary, verifiable, and coordinated pause mechanism, triggered when risk signals exceed a defined threshold. Reducing this proposal to "Anthropic wants to kill AI" is bad faith that prevents debate.
Mistake 2: Disqualifying the call because of the financial paradox
The fact that Anthropic is valued at $965 billion does not automatically invalidate its technical analysis. The data showing that 80% of code is written by Claude is independent of the valuation. One must separate the ethical argument (which can be strategic) from the factual data (which is verifiable).
Mistake 3: Thinking that RSI is a distant problem
The most dangerous mistake is treating recursive improvement as a science-fiction scenario. Anthropic is publishing concrete metrics: 80% of production code, an 8x productivity factor, near-total resolution of internal research challenges. RSI is not 10 years away. It is in Anthropic's codebase today.
Mistake 4: Believing that AI development tools are neutral
Using the best AI tools for code like Cursor or Copilot without reflecting on the overall trajectory of the industry means ignoring that every line of code accepted without rigorous verification reinforces the self-improvement loop. Vigilance must be exercised at every level of the chain.
❓ Frequently Asked Questions
What exactly is recursive self-improvement?
It is the process by which an AI system designs, implements, and trains its own successor without substantial human intervention. Anthropic describes a trajectory where Claude participates in every step of its development, from identifying weaknesses to evaluating improvements.
Did Anthropic really write 80% of its code with Claude?
Yes, according to data published by The Decoder and confirmed by the Anthropic Institute report. The share increased from 10% to over 80% in 16 months, with a marked acceleration in 2025 when Claude moved from suggesting code to executing it autonomously.
Why is the call for a pause being criticized?
Because Anthropic raised $65 billion one week prior, at a valuation of $965 billion, and continues to deploy 220,000 GPUs via Colossus 1. The disconnect between actions accelerating development and a discourse of caution is perceived as hypocritical by many observers.
Will other labs follow this call?
No frontier lab (OpenAI, Google, xAI) has publicly joined the call. The mechanism of a coordinated pause would require an international agreement and verification capabilities that no one has yet put in place.
Did Claude really criticize its own creator's call?
A prompt leak reported by WIONews suggests that Claude called the call "pre-IPO cosplay". Anthropic has neither confirmed nor denied this. If authentic, it is a striking example of a model commenting on attempts to control its own development.
Which model should you choose to code with full knowledge of the facts?
Claude Opus 4.7 (94.3) and GPT-5.5 (98.2) dominate agentic benchmarks. The choice should factor in not only raw performance but also the lab's transparency regarding its safety practices. The comparison of the best LLMs for coding details these dimensions.
✅ Conclusion
Anthropic is the first AI lab to publish concrete evidence that recursive improvement is underway within its walls, then ask the rest of the world to slow down. The 80% of code written by Claude is not a projection: it is the present. The financial paradox is real, the motivations are mixed, but the technical data is hard to ignore. The question is no longer whether RSI will arrive, but who will have the courage to hit pause before the loop becomes uncontrollable.